Message ID | 1545318971-28351-3-git-send-email-sgruszka@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Kalle Valo |
Headers | show |
Series | [v2,1/3] rt2x00: use ratelimited variants dev_warn/dev_err | expand |
Shouldn't you now also revert the commits from https://git.openwrt.org/?p=openwrt/staging/dangole.git;a=commitdiff;h=de1c58a64bd66319e770d2587da07d8c9c90174a since they caused throughput regression? On 20/12/2018, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > For unknown reasons printk() on some context can cause CPU hung on > embedded MT7620 AP/router MIPS platforms. What can result on wifi > disconnects. > > This patch move queue full messages to debug level what is consistent > with other mac80211 drivers which drop packet silently if tx queue is > full. This make MT7620 OpenWRT routers more stable, what was reported > by various users. > > Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> > --- > drivers/net/wireless/ralink/rt2x00/rt2x00queue.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c > b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c > index 92ddc19e7bf7..947fc8964e9a 100644 > --- a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c > +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c > @@ -671,7 +671,7 @@ int rt2x00queue_write_tx_frame(struct data_queue *queue, > struct sk_buff *skb, > spin_lock(&queue->tx_lock); > > if (unlikely(rt2x00queue_full(queue))) { > - rt2x00_err(queue->rt2x00dev, "Dropping frame due to full tx queue %d\n", > + rt2x00_dbg(queue->rt2x00dev, "Dropping frame due to full tx queue %d\n", > queue->qid); > ret = -ENOBUFS; > goto out; > -- > 2.7.5 > >
On Thu, Dec 20, 2018 at 06:52:03PM +0100, Tom Psyborg wrote: > Shouldn't you now also revert the commits from > https://git.openwrt.org/?p=openwrt/staging/dangole.git;a=commitdiff;h=de1c58a64bd66319e770d2587da07d8c9c90174a > since they caused throughput regression? For the record: this is not about currently posted patch and you reference this: https://lore.kernel.org/linux-wireless/1538697102-3764-1-git-send-email-pozega.tomislav@gmail.com/ I'm not convinced for reverting. I would rater fix the problem on top of the patches. Looking at those images, the performance vary also on "good" case and there is entry 12.6 Mbits/s there, where the lowest entry on "bad" case is 32.5 Mbits/s . However, yes, if performance do not drop for unknown reason, it is 51.4 Mbits/s with patches and 56.6 Mbits/s on good case. Could you specify after applying which one of those 5 patches throughput regression starts ? Thanks Stanislaw
On 21/12/2018, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > On Thu, Dec 20, 2018 at 06:52:03PM +0100, Tom Psyborg wrote: >> Shouldn't you now also revert the commits from >> https://git.openwrt.org/?p=openwrt/staging/dangole.git;a=commitdiff;h=de1c58a64bd66319e770d2587da07d8c9c90174a >> since they caused throughput regression? > > For the record: this is not about currently posted patch and you > reference this: > https://lore.kernel.org/linux-wireless/1538697102-3764-1-git-send-email-pozega.tomislav@gmail.com/ > > I'm not convinced for reverting. I would rater fix the problem > on top of the patches. > Let me get this straight: You made these patches to fix wlan stalls when "arrived at non-free entry" error message appears, but in turn they result in lower throughput. Now you are moving err printks to dbg level since the printks themselfs are causing wlan problems. Why would you waste life fixing patches that don't do any good? Just revert all of them! > Looking at those images, the performance vary also on "good" case and > there is entry 12.6 Mbits/s there, where the lowest entry on "bad" case > is 32.5 Mbits/s . However, yes, if performance do not drop for unknown > reason, it is 51.4 Mbits/s with patches and 56.6 Mbits/s on good case. > > Could you specify after applying which one of those 5 patches > throughput regression starts ? > > Thanks > Stanislaw > I cannot test that ATM but the best would be to revert all of them. And, for the future reference, please try sending them to the openwrt-devel list first for testing before deciding to go upstream. Regards, Tom
Even with your patches that are currently being tested i had interface frozen yesterday. 2 android pads, 2 android phones and laptop with intel card on win10. Wlan frozen about half an hour after bootup, had to restart interface that revelaed more queue problems: [ 238.715836] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready [ 238.722681] br-lan: port 2(wlan1) entered blocking state [ 238.728107] br-lan: port 2(wlan1) entered forwarding state [ 2403.051342] device wlan1 left promiscuous mode [ 2403.056021] br-lan: port 2(wlan1) entered disabled state [ 2403.213039] ieee80211 phy1: rt2800_config_channel: Warning - Using incomplete support for external PA [ 2403.522622] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 0 failed to flush [ 2403.817143] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 1 failed to flush [ 2404.058047] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 2 failed to flush [ 2404.299509] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 3 failed to flush [ 2404.566195] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 0 failed to flush [ 2404.806880] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 1 failed to flush [ 2405.047624] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 2 failed to flush [ 2405.288195] ieee80211 phy1: rt2x00queue_flush_queue: Warning - Queue 3 failed to flush
On Sat, Dec 22, 2018 at 02:12:13PM +0100, Tom Psyborg wrote: > On 21/12/2018, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > > On Thu, Dec 20, 2018 at 06:52:03PM +0100, Tom Psyborg wrote: > >> Shouldn't you now also revert the commits from > >> https://git.openwrt.org/?p=openwrt/staging/dangole.git;a=commitdiff;h=de1c58a64bd66319e770d2587da07d8c9c90174a > >> since they caused throughput regression? > > > > For the record: this is not about currently posted patch and you > > reference this: > > https://lore.kernel.org/linux-wireless/1538697102-3764-1-git-send-email-pozega.tomislav@gmail.com/ > > > > I'm not convinced for reverting. I would rater fix the problem > > on top of the patches. > > > > Let me get this straight: You made these patches to fix wlan stalls > when "arrived at non-free entry" error message appears, but in turn > they result in lower throughput. Now you are moving err printks to dbg > level since the printks themselfs are causing wlan problems. Why would > you waste life fixing patches that don't do any good? Just revert all > of them! I move prints to debug level for for "arrived at non-free entry" from "Dropping frame due to full tx queue" . For some users those 5 patches improve things, they fix problem of router connection hung completely. Printk problem is connection stall due to CPU being busy, but is possible to reconnect to AP after that. It's different issue. Also 'result in lower throughput' is over-generalized statement. More precise would be 'it lower throughput on some test cases', but even in those specific cases, throughput vary randomly with and without patches. > > Looking at those images, the performance vary also on "good" case and > > there is entry 12.6 Mbits/s there, where the lowest entry on "bad" case > > is 32.5 Mbits/s . However, yes, if performance do not drop for unknown > > reason, it is 51.4 Mbits/s with patches and 56.6 Mbits/s on good case. > > > > Could you specify after applying which one of those 5 patches > > throughput regression starts ? > > > > Thanks > > Stanislaw > > > > I cannot test that ATM but the best would be to revert all of them. Ok, could anyone else confirm this throughput regression and point offending commit ? > And, for the future reference, please try sending them to the > openwrt-devel list first for testing before deciding to go upstream. Sure, I can send patches to openwrt-devel. However, for the record, I informed about those patches interested people including you in August: https://lore.kernel.org/linux-wireless/20180815114029.GA1862@redhat.com/ month before I submitted them upstream. Thanks Stanislaw
On Tue, Dec 25, 2018 at 11:43:09PM +0100, Tom Psyborg wrote: > Even with your patches that are currently being tested i had interface > frozen yesterday. 2 android pads, 2 android phones and laptop with > intel card on win10. Wlan frozen about half an hour after bootup, had > to restart interface that revelaed more queue problems: > > [ 238.715836] IPv6: ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready > [ 238.722681] br-lan: port 2(wlan1) entered blocking state > [ 238.728107] br-lan: port 2(wlan1) entered forwarding state > [ 2403.051342] device wlan1 left promiscuous mode > [ 2403.056021] br-lan: port 2(wlan1) entered disabled state > [ 2403.213039] ieee80211 phy1: rt2800_config_channel: Warning - Using > incomplete support for external PA Yes, patches will not help if device is not programed properly, i.e. when external PA is not well configured. I already pointed this here: https://lore.kernel.org/linux-wireless/20180815114029.GA1862@redhat.com/ Register programming should be fixed there and also whould be good to implement watchdog to recover from wifi hung it it happens. > [ 2403.522622] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 0 failed to flush > [ 2403.817143] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 1 failed to flush > [ 2404.058047] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 2 failed to flush > [ 2404.299509] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 3 failed to flush > [ 2404.566195] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 0 failed to flush > [ 2404.806880] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 1 failed to flush > [ 2405.047624] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 2 failed to flush > [ 2405.288195] ieee80211 phy1: rt2x00queue_flush_queue: Warning - > Queue 3 failed to flush Hmm, this one should be gone. Could you check attached patch? Thanks Stanislaw diff --git a/drivers/net/wireless/ralink/rt2x00/rt2800soc.c b/drivers/net/wireless/ralink/rt2x00/rt2800soc.c index a502816214ab..05a801774713 100644 --- a/drivers/net/wireless/ralink/rt2x00/rt2800soc.c +++ b/drivers/net/wireless/ralink/rt2x00/rt2800soc.c @@ -203,7 +203,7 @@ static const struct rt2x00lib_ops rt2800soc_rt2x00_ops = { .start_queue = rt2800mmio_start_queue, .kick_queue = rt2800mmio_kick_queue, .stop_queue = rt2800mmio_stop_queue, - .flush_queue = rt2x00mmio_flush_queue, + .flush_queue = rt2800mmio_flush_queue, .write_tx_desc = rt2800mmio_write_tx_desc, .write_tx_data = rt2800_write_tx_data, .write_beacon = rt2800_write_beacon,
> > I move prints to debug level for for "arrived at non-free entry" from > "Dropping frame due to full tx queue" . For some users those 5 patches > improve things, they fix problem of router connection hung completely. > Printk problem is connection stall due to CPU being busy, but > is possible to reconnect to AP after that. It's different issue. > yes they fix hung. same hung that does this single patch fix by moving to dbg so they're not printed at all. FYI "Dropping frame due to full tx queue" causes "arrived at non-free entry" eventually that freezes interface so only reboot helps, removing "Dropping frame due to full tx queue" from printing to logs prevents this kind of hung and is better solution than 5 patches. oh and don't forget to add note that rt2x00_err/rt2x00_dbg prints can cause serious bug so you don't acidentally enable them again in a year or two! > Also 'result in lower throughput' is over-generalized statement. More > precise would be 'it lower throughput on some test cases', but even > in those specific cases, throughput vary randomly with and without > patches. > 100Mbps TX 70Mbps RX without and 80Mbps TX 55 Mbps RX with 5 patches--> EA2750 HT40 CH2 4.14.82+4.19-rc5-1-1
On Fri, Dec 28, 2018 at 01:45:27AM +0100, Tom Psyborg wrote: > > > > I move prints to debug level for for "arrived at non-free entry" from > > "Dropping frame due to full tx queue" . For some users those 5 patches > > improve things, they fix problem of router connection hung completely. > > Printk problem is connection stall due to CPU being busy, but > > is possible to reconnect to AP after that. It's different issue. > > > yes they fix hung. same hung that does this single patch fix by moving > to dbg so they're not printed at all. FYI "Dropping frame due to full > tx queue" causes "arrived at non-free entry" eventually that freezes > interface so only reboot helps, removing "Dropping frame due to full > tx queue" from printing to logs prevents this kind of hung and is > better solution than 5 patches. I wonder why are you confident about this ? And also why 5 patches fixed the hung if there were still "Dropping frame ..." prints ? Anyway I'll ask people on the bz for confirmation, if stop printing is sufficient to fix the wifi hung. > oh and don't forget to add note that > rt2x00_err/rt2x00_dbg prints can cause serious bug so you don't > acidentally enable them again in a year or two! I think I will remember that, for others git changelog should be sufficient. > > Also 'result in lower throughput' is over-generalized statement. More > > precise would be 'it lower throughput on some test cases', but even > > in those specific cases, throughput vary randomly with and without > > patches. > > > > 100Mbps TX 70Mbps RX without and 80Mbps TX 55 Mbps RX with 5 > patches--> EA2750 HT40 CH2 4.14.82+4.19-rc5-1-1 Ok, this is different case that you presented before on USB devices. And I'm more concerned about it, because changes with MMIO ware much bigger (for USB only some code was moved and some flush changed were done). Please check attached patch, if it fixes throughput regression for you ? Thanks Stanislaw diff --git a/drivers/net/wireless/ralink/rt2x00/rt2800lib.c b/drivers/net/wireless/ralink/rt2x00/rt2800lib.c index fe365cda0841..17e87e9b3d19 100644 --- a/drivers/net/wireless/ralink/rt2x00/rt2800lib.c +++ b/drivers/net/wireless/ralink/rt2x00/rt2800lib.c @@ -961,8 +961,7 @@ static bool rt2800_txdone_entry_check(struct queue_entry *entry, u32 reg) { __le32 *txwi; u32 word; - int wcid, ack, pid; - int tx_wcid, tx_ack, tx_pid, is_agg; + int wcid, tx_wcid; /* * This frames has returned with an IO error, @@ -972,23 +971,17 @@ static bool rt2800_txdone_entry_check(struct queue_entry *entry, u32 reg) if (test_bit(ENTRY_DATA_IO_FAILED, &entry->flags)) return false; - wcid = rt2x00_get_field32(reg, TX_STA_FIFO_WCID); - ack = rt2x00_get_field32(reg, TX_STA_FIFO_TX_ACK_REQUIRED); - pid = rt2x00_get_field32(reg, TX_STA_FIFO_PID_TYPE); - is_agg = rt2x00_get_field32(reg, TX_STA_FIFO_TX_AGGRE); - /* * Validate if this TX status report is intended for - * this entry by comparing the WCID/ACK/PID fields. + * this entry by comparing the WCID field. */ txwi = rt2800_drv_get_txwi(entry); - word = rt2x00_desc_read(txwi, 1); tx_wcid = rt2x00_get_field32(word, TXWI_W1_WIRELESS_CLI_ID); - tx_ack = rt2x00_get_field32(word, TXWI_W1_ACK); - tx_pid = rt2x00_get_field32(word, TXWI_W1_PACKETID); - if (wcid != tx_wcid || ack != tx_ack || (!is_agg && pid != tx_pid)) { + wcid = rt2x00_get_field32(reg, TX_STA_FIFO_WCID); + + if (wcid != tx_wcid) { rt2x00_dbg(entry->queue->rt2x00dev, "TX status report missed for queue %d entry %d\n", entry->queue->qid, entry->entry_idx);
On Wed, Jan 02, 2019 at 09:19:34AM +0100, Stanislaw Gruszka wrote: > On Fri, Dec 28, 2018 at 01:45:27AM +0100, Tom Psyborg wrote: > > > > > > I move prints to debug level for for "arrived at non-free entry" from > > > "Dropping frame due to full tx queue" . For some users those 5 patches > > > improve things, they fix problem of router connection hung completely. > > > Printk problem is connection stall due to CPU being busy, but > > > is possible to reconnect to AP after that. It's different issue. > > > > > yes they fix hung. same hung that does this single patch fix by moving > > to dbg so they're not printed at all. FYI "Dropping frame due to full > > tx queue" causes "arrived at non-free entry" eventually that freezes > > interface so only reboot helps, removing "Dropping frame due to full > > tx queue" from printing to logs prevents this kind of hung and is > > better solution than 5 patches. > > I wonder why are you confident about this ? And also why 5 patches > fixed the hung if there were still "Dropping frame ..." prints ? > > Anyway I'll ask people on the bz for confirmation, if stop printing > is sufficient to fix the wifi hung. > > > oh and don't forget to add note that > > rt2x00_err/rt2x00_dbg prints can cause serious bug so you don't > > acidentally enable them again in a year or two! > > I think I will remember that, for others git changelog should > be sufficient. > > > > Also 'result in lower throughput' is over-generalized statement. More > > > precise would be 'it lower throughput on some test cases', but even > > > in those specific cases, throughput vary randomly with and without > > > patches. > > > > > > > 100Mbps TX 70Mbps RX without and 80Mbps TX 55 Mbps RX with 5 > > patches--> EA2750 HT40 CH2 4.14.82+4.19-rc5-1-1 > > Ok, this is different case that you presented before on USB devices. > And I'm more concerned about it, because changes with MMIO ware > much bigger (for USB only some code was moved and some flush > changed were done). Please check attached patch, if it fixes > throughput regression for you ? So what is the point complaining that I do not provide patches for testing and when I do, ignore that ? Anyway I have now MT7620A device and was capable to reproduce the performance regression. Will provide patches to fix it and still prevent tx queues hungs. Stanislaw
Can you reproduce interface freeze with this patch if you enable debug options to print all dbg messages?
On Sat, Feb 09, 2019 at 12:11:03PM +0100, Tom Psyborg wrote: > Can you reproduce interface freeze with this patch if you enable debug > options to print all dbg messages? Haven't tried that, but I expect enabling debug messages will cause troubles at hardware where printk make CPU hog. Stanislaw
On 09/02/2019, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > On Sat, Feb 09, 2019 at 12:11:03PM +0100, Tom Psyborg wrote: >> Can you reproduce interface freeze with this patch if you enable debug >> options to print all dbg messages? > > Haven't tried that, but I expect enabling debug messages will > cause troubles at hardware where printk make CPU hog. > > Stanislaw > Take a look at my comment from 23.01.2019. on this link: https://bugs.openwrt.org/index.php?do=details&task_id=2018 There might be a chance dbg printks do not cause problems.
Hi Tom, On Sat, Feb 09, 2019 at 01:28:32PM +0100, Tom Psyborg wrote: > On 09/02/2019, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > > On Sat, Feb 09, 2019 at 12:11:03PM +0100, Tom Psyborg wrote: > >> Can you reproduce interface freeze with this patch if you enable debug > >> options to print all dbg messages? > > > > Haven't tried that, but I expect enabling debug messages will > > cause troubles at hardware where printk make CPU hog. > > > > Stanislaw > > > > Take a look at my comment from 23.01.2019. on this link: > https://bugs.openwrt.org/index.php?do=details&task_id=2018 > There might be a chance dbg printks do not cause problems. This is very likely just overheat because of missing TSSI. It happends on boards where MT7620A is used without heatsink (and easy to reproduce, I've tried. We just need TSSI and for bad board design it will mean limiting to one TX-queue soon after serious traffic starts). We should tell those users to glue heatsink on the chip for $0.05 they can fix this in hardware -- and implement TSSI to mitigate at least the very worst effects (such as hardware stuck until reset). So dkg printks can make this worse, of course, because logging overhead also burns CPU cycles and increases system load. I remember in Rt5350 (and other predecessors of Rt6352 aka. MT7620) also had problems like that and it was possible to improve it by all sorts of things like removing power from unused ports of the Ethernet switch -- so for those boards (like Xiaomi MiWiFi mini) even this maybe the trick, because they also only use 3 out of 5 ports, and maybe we need to try harder to disable the power of unused ports (and also those without cable plugged in only periodicly power them up and check for link).... Cheers Daniel
On 09/02/2019, Daniel Golle <daniel@makrotopia.org> wrote: > Hi Tom, > > On Sat, Feb 09, 2019 at 01:28:32PM +0100, Tom Psyborg wrote: >> On 09/02/2019, Stanislaw Gruszka <sgruszka@redhat.com> wrote: >> > On Sat, Feb 09, 2019 at 12:11:03PM +0100, Tom Psyborg wrote: >> >> Can you reproduce interface freeze with this patch if you enable debug >> >> options to print all dbg messages? >> > >> > Haven't tried that, but I expect enabling debug messages will >> > cause troubles at hardware where printk make CPU hog. >> > >> > Stanislaw >> > >> >> Take a look at my comment from 23.01.2019. on this link: >> https://bugs.openwrt.org/index.php?do=details&task_id=2018 >> There might be a chance dbg printks do not cause problems. > > This is very likely just overheat because of missing TSSI. It happends > on boards where MT7620A is used without heatsink (and easy to > reproduce, I've tried. We just need TSSI and for bad board design it > will mean limiting to one TX-queue soon after serious traffic starts). > We should tell those users to glue heatsink on the chip for $0.05 they > can fix this in hardware -- and implement TSSI to mitigate at least the > very worst effects (such as hardware stuck until reset). > So dkg printks can make this worse, of course, because logging overhead > also burns CPU cycles and increases system load. > > I remember in Rt5350 (and other predecessors of Rt6352 aka. MT7620) > also had problems like that and it was possible to improve it by all > sorts of things like removing power from unused ports of the Ethernet > switch -- so for those boards (like Xiaomi MiWiFi mini) even this maybe > the trick, because they also only use 3 out of 5 ports, and maybe we > need to try harder to disable the power of unused ports (and also those > without cable plugged in only periodicly power them up and check for > link).... > > Cheers > > Daniel > Disagree. MT7620 in Xiaomi MiWiFi mini has thermal pad sticked to enough large metal shielding that is sticked with another thermal pad to even bigger metal part of case. So thermal dissipation in that case is very good. Also with pandorabox firmware 200Mbps of stable throughput is possible without any throttling which I believe would not be the case if the cooling system wasn't reliable. Back on subject, this patch is tested and proven to work, so ACK from me.
Hi Tom, On Sat, Feb 09, 2019 at 05:29:33PM +0100, Tom Psyborg wrote: > On 09/02/2019, Daniel Golle <daniel@makrotopia.org> wrote: > > Hi Tom, > > > > On Sat, Feb 09, 2019 at 01:28:32PM +0100, Tom Psyborg wrote: > >> On 09/02/2019, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > >> > On Sat, Feb 09, 2019 at 12:11:03PM +0100, Tom Psyborg wrote: > >> >> Can you reproduce interface freeze with this patch if you enable debug > >> >> options to print all dbg messages? > >> > > >> > Haven't tried that, but I expect enabling debug messages will > >> > cause troubles at hardware where printk make CPU hog. > >> > > >> > Stanislaw > >> > > >> > >> Take a look at my comment from 23.01.2019. on this link: > >> https://bugs.openwrt.org/index.php?do=details&task_id=2018 > >> There might be a chance dbg printks do not cause problems. > > > > This is very likely just overheat because of missing TSSI. It happends > > on boards where MT7620A is used without heatsink (and easy to > > reproduce, I've tried. We just need TSSI and for bad board design it > > will mean limiting to one TX-queue soon after serious traffic starts). > > We should tell those users to glue heatsink on the chip for $0.05 they > > can fix this in hardware -- and implement TSSI to mitigate at least the > > very worst effects (such as hardware stuck until reset). > > So dkg printks can make this worse, of course, because logging overhead > > also burns CPU cycles and increases system load. > > > > I remember in Rt5350 (and other predecessors of Rt6352 aka. MT7620) > > also had problems like that and it was possible to improve it by all > > sorts of things like removing power from unused ports of the Ethernet > > switch -- so for those boards (like Xiaomi MiWiFi mini) even this maybe > > the trick, because they also only use 3 out of 5 ports, and maybe we > > need to try harder to disable the power of unused ports (and also those > > without cable plugged in only periodicly power them up and check for > > link).... > > > > Cheers > > > > Daniel > > > > Disagree. MT7620 in Xiaomi MiWiFi mini has thermal pad sticked to > enough large metal shielding that is sticked with another thermal pad > to even bigger metal part of case. So thermal dissipation in that case > is very good. Good to hear that, and yes it may still be, see below. > Also with pandorabox firmware 200Mbps of stable throughput is possible >20MiB/s download rate is very impressive for 2.4GHz WiFi, I never saw anything even near that (using BitTorrent I get up to 8MiB/s when next to the router in low-noise environment with proprietry ubiquiti gear). > without any throttling which I believe would not be the case if the > cooling system wasn't reliable. pandorabox uses MediaTek proprietary driver and that limits to single TX chain operation once temperature limit is hit (read the code, you know where), same as in Xiaomi proprietary firmware. > > Back on subject, this patch is tested and proven to work, so ACK from me. Thanks for testing. I will push Stanislavs complete series into OpenWrt tomorrow. Cheers Daniel
On Sat, Feb 09, 2019 at 01:28:32PM +0100, Tom Psyborg wrote: > On 09/02/2019, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > > On Sat, Feb 09, 2019 at 12:11:03PM +0100, Tom Psyborg wrote: > >> Can you reproduce interface freeze with this patch if you enable debug > >> options to print all dbg messages? > > > > Haven't tried that, but I expect enabling debug messages will > > cause troubles at hardware where printk make CPU hog. > > > > Stanislaw > > > > Take a look at my comment from 23.01.2019. on this link: > https://bugs.openwrt.org/index.php?do=details&task_id=2018 > There might be a chance dbg printks do not cause problems. I'm sure printk cause problems on some hardware i.e. Wt3020 . There is hypothesis that is because kernel try to emit messages by misconfigured 56k serial line, but who knows. This should be debugged and fixed but I have not idea how to do this. Another issue not related with printk is tx queue hung, when queue is full but we do not get any TX status interrupt. This was mitigated in my previous set that caused throughput regression and should be now fixed by latest RFC/RTF set. But there are other issues, that could cause troubles. Stanislaw
On 09/02/2019, Daniel Golle <daniel@makrotopia.org> wrote: >>20MiB/s download rate is very impressive for 2.4GHz WiFi, I never saw > anything even near that (using BitTorrent I get up to 8MiB/s when next > to the router in low-noise environment with proprietry ubiquiti gear). various apps and data transfer stacks may add significant overhead. for real throughput testing i rely on iperf2 (tried iperf3 it was mess). if interested take a look at bandwidth achieved on OCed QCA9531 device: https://forum.openwrt.org/t/different-wifi-speed-tx-rx-archer-c7-v2/18860/10?u=psyborg
diff --git a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c index 92ddc19e7bf7..947fc8964e9a 100644 --- a/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c +++ b/drivers/net/wireless/ralink/rt2x00/rt2x00queue.c @@ -671,7 +671,7 @@ int rt2x00queue_write_tx_frame(struct data_queue *queue, struct sk_buff *skb, spin_lock(&queue->tx_lock); if (unlikely(rt2x00queue_full(queue))) { - rt2x00_err(queue->rt2x00dev, "Dropping frame due to full tx queue %d\n", + rt2x00_dbg(queue->rt2x00dev, "Dropping frame due to full tx queue %d\n", queue->qid); ret = -ENOBUFS; goto out;
For unknown reasons printk() on some context can cause CPU hung on embedded MT7620 AP/router MIPS platforms. What can result on wifi disconnects. This patch move queue full messages to debug level what is consistent with other mac80211 drivers which drop packet silently if tx queue is full. This make MT7620 OpenWRT routers more stable, what was reported by various users. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> --- drivers/net/wireless/ralink/rt2x00/rt2x00queue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)