diff mbox series

wifi: rtlwifi: Drastically reduce the attempts to read efuse bytes in case of failures

Message ID 20241025150226.896613-1-gpiccoli@igalia.com (mailing list archive)
State Changes Requested
Delegated to: Ping-Ke Shih

Commit Message

Guilherme G. Piccoli Oct. 25, 2024, 3:02 p.m. UTC
Syzkaller reported a hung task with uevent_show() on stack trace. That
specific issue was addressed by another commit [0], but even with that
fix applied (for example, running v6.12-rc4) we face another type of hung
task that comes from the same reproducer [1]. By investigating that, we
could narrow it to the following path:

(a) Syzkaller emulates a Realtek USB WiFi adapter using raw-gadget and
dummy_hcd infrastructure.

(b) During the probe of rtl8192cu, the driver ends up performing an efuse
read procedure (which is related to EEPROM load IIUC), and here lies the
issue: the function read_efuse() calls read_efuse_byte() many times, in
loop iterations that depend on the efuse size (in our example, 512 in total).

This procedure for reading efuse bytes relies on a loop that performs an
I/O read up to *10k* times in case of failures. We measured the time of
the loop inside read_efuse_byte() alone, and in this reproducer (which
involves the dummy_hcd emulation layer), it takes 15 seconds each. As a
consequence, the driver is stuck in its probe routine for a long time,
exposing a stack trace like the one below if we attempt to reboot the
system, for example:

task:kworker/0:3 state:D stack:0 pid:662 tgid:662 ppid:2 flags:0x00004000
Workqueue: usb_hub_wq hub_event
Call Trace:
 __schedule+0xe22/0xeb6
 schedule_timeout+0xe7/0x132
 __wait_for_common+0xb5/0x12e
 usb_start_wait_urb+0xc5/0x1ef
 ? usb_alloc_urb+0x95/0xa4
 usb_control_msg+0xff/0x184
 _usbctrl_vendorreq_sync+0xa0/0x161
 _usb_read_sync+0xb3/0xc5
 read_efuse_byte+0x13c/0x146
 read_efuse+0x351/0x5f0
 efuse_read_all_map+0x42/0x52
 rtl_efuse_shadow_map_update+0x60/0xef
 rtl_get_hwinfo+0x5d/0x1c2
 rtl92cu_read_eeprom_info+0x10a/0x8d5
 ? rtl92c_read_chip_version+0x14f/0x17e
 rtl_usb_probe+0x323/0x851
 usb_probe_interface+0x278/0x34b
 really_probe+0x202/0x4a4
 __driver_probe_device+0x166/0x1b2
 driver_probe_device+0x2f/0xd8
 [...]

We hereby propose to drastically reduce the attempts of doing the I/O read
in case of failures, from 10000 to 10. With that, we got responsiveness in
the reproducer, and it seems reasonable to believe that there's no sane
device implementation in the field requiring this number of retries at every
I/O read in order to properly work. Based on that assumption, it'd be good to
have it backported to stable, though maybe not all the way back to the
original driver implementation (the 10k number comes from day 0); perhaps up
to the 6.x series makes sense.

[0] Commit 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race").

[1] A note about that: this syzkaller report presents multiple reproducers
that differ by the type of emulated USB device. For this specific case,
check the entry from 2024/08/08 06:23 in the list of crashes; the C repro
is available at https://syzkaller.appspot.com/text?tag=ReproC&x=1521fc83980000.

Cc: stable@vger.kernel.org # v6.1+
Reported-by: syzbot+edd9fe0d3a65b14588d5@syzkaller.appspotmail.com
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
---
 drivers/net/wireless/realtek/rtlwifi/efuse.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ping-Ke Shih Oct. 28, 2024, 1:44 a.m. UTC | #1
Guilherme G. Piccoli <gpiccoli@igalia.com> wrote:
> 
> This procedure for reading efuse bytes relies in a loop that performs an
> I/O read up to *10k* times in case of failures. We measured the time of
> the loop inside read_efuse_byte() alone, and in this reproducer (which
> involves the dummy_hcd emulation layer), it takes 15 seconds each.

The 10k I/O reads are for polling whether the efuse is ready, and the
following statement then actually reads the efuse content back. For USB
devices, I/O is slow, so it might be fine to reduce the retry count. But for
PCIe devices, I think this will be risky without testing on real hardware.

A possible way is to use "rtlhal->interface == INTF_PCI" to keep the original
count for PCIe devices, and only reduce the retry count for USB devices. But
USB can operate in USB-2/-3 modes, so experiments with real hardware may
still be needed to get a reasonable retry count.
Guilherme G. Piccoli Oct. 28, 2024, 2:39 p.m. UTC | #2
On 27/10/2024 22:44, Ping-Ke Shih wrote:
> Guilherme G. Piccoli <gpiccoli@igalia.com> wrote:
>>
>> This procedure for reading efuse bytes relies in a loop that performs an
>> I/O read up to *10k* times in case of failures. We measured the time of
>> the loop inside read_efuse_byte() alone, and in this reproducer (which
>> involves the dummy_hcd emulation layer), it takes 15 seconds each.
> 
> The I/O read of 10k times is to polling if efuse is ready, and then following
> statement is to actually read efuse content back. For USB devices, I/O is
> slow, so it might be fine to reduce retry times. But For PCIE devices,
> I think this will be risky without testing with real hardware.
> 
> Possible way is to use "rtlhal->interface == INTF_PCI" to keep original times
> for PCIE devices, and only reduce retry times for USB devices. But USB can
> operate on USB-2/-3 modes, so maybe still need experiments with real hardware 
> to get reasonable retry times. 
> 

Thanks a bunch for the review and extra details Ping-Ke Shih!

The idea of guarding with "rtlhal->interface == INTF_PCI" is very good
and I can implement in a V2.

But can you help me on finding a USB adapter that runs this path? If you
know a commodity model that uses this specific driver, could you point
me so I can buy one for testing?

Meanwhile I'll try to find a model based on some kernel reports online,
hope I can!

Cheers,


Guilherme
Ping-Ke Shih Oct. 29, 2024, 12:50 a.m. UTC | #3
> 
> But can you help me on finding a USB adapter that runs this path? If you
> know a commodity model that uses this specific driver, could you point
> me so I can buy one for testing?
> 

I don't know that. Maybe, Bitterblue Smith (Cc'd) can share how/where he got
USB adapters.
Bitterblue Smith Oct. 29, 2024, 1:20 p.m. UTC | #4
On 29/10/2024 02:50, Ping-Ke Shih wrote:
>>
>> But can you help me on finding a USB adapter that runs this path? If you
>> know a commodity model that uses this specific driver, could you point
>> me so I can buy one for testing?
>>
> 
> I don't know that. Maybe, Bitterblue Smith (Cc'd) can share how/where he got
> USB adapters. 
> 
> 

I got them from Aliexpress. Both listings are gone now, but I still
see others:

https://www.aliexpress.com/item/1005007655660231.html
https://www.aliexpress.com/item/1005007688991958.html

Mine was only 6 USD in March 2023. I don't know why this obsolete
product got so expensive.

For RTL8192DU only modules are available:

https://www.aliexpress.com/item/4000191417711.html
https://www.aliexpress.com/item/1005007343563100.html

Someone gave me this link (I didn't buy):
https://www.amazon.com/Netis-Wireless-Raspberry-Windows-RTL8188CUS/dp/B008O2AL0K

Note that the Netis WF2120 can have newer chips inside which will
not use this driver.
Guilherme G. Piccoli Oct. 29, 2024, 1:31 p.m. UTC | #5
On 29/10/2024 10:20, Bitterblue Smith wrote:
> On 29/10/2024 02:50, Ping-Ke Shih wrote:
>>>
>>> But can you help me on finding a USB adapter that runs this path? If you
>>> know a commodity model that uses this specific driver, could you point
>>> me so I can buy one for testing?
>>>
>>
>> I don't know that. Maybe, Bitterblue Smith (Cc'd) can share how/where he got
>> USB adapters. 
>>
>>
> 
> I got them from Aliexpress. Both listings are gone now, but I still
> see others:
> 
> https://www.aliexpress.com/item/1005007655660231.html
> https://www.aliexpress.com/item/1005007688991958.html
> 
> Mine was only 6 USD in March 2023. I don't know why this obsolete
> product got so expensive.
> 
> For RTL8192DU only modules are available:
> 
> https://www.aliexpress.com/item/4000191417711.html
> https://www.aliexpress.com/item/1005007343563100.html
> 
> Someone gave me this link (I didn't buy):
> https://www.amazon.com/Netis-Wireless-Raspberry-Windows-RTL8188CUS/dp/B008O2AL0K
> 
> Note that the Netis WF2120 can have newer chips inside which will
> not use this driver.
> 

Thanks a bunch for the info and links, much appreciated! I can try to
grab one for testing, but let me ask also: would you be willing to test
that for me, Bitterblue? If so, I can resubmit today with the PCI check.

If not possible, no worries, I can buy one.
Cheers,


Guilherme
Bitterblue Smith Oct. 29, 2024, 4:55 p.m. UTC | #6
On 29/10/2024 15:31, Guilherme G. Piccoli wrote:
> On 29/10/2024 10:20, Bitterblue Smith wrote:
>> On 29/10/2024 02:50, Ping-Ke Shih wrote:
>>>>
>>>> But can you help me on finding a USB adapter that runs this path? If you
>>>> know a commodity model that uses this specific driver, could you point
>>>> me so I can buy one for testing?
>>>>
>>>
>>> I don't know that. Maybe, Bitterblue Smith (Cc'd) can share how/where he got
>>> USB adapters. 
>>>
>>>
>>
>> I got them from Aliexpress. Both listings are gone now, but I still
>> see others:
>>
>> https://www.aliexpress.com/item/1005007655660231.html
>> https://www.aliexpress.com/item/1005007688991958.html
>>
>> Mine was only 6 USD in March 2023. I don't know why this obsolete
>> product got so expensive.
>>
>> For RTL8192DU only modules are available:
>>
>> https://www.aliexpress.com/item/4000191417711.html
>> https://www.aliexpress.com/item/1005007343563100.html
>>
>> Someone gave me this link (I didn't buy):
>> https://www.amazon.com/Netis-Wireless-Raspberry-Windows-RTL8188CUS/dp/B008O2AL0K
>>
>> Note that the Netis WF2120 can have newer chips inside which will
>> not use this driver.
>>
> 
> Thanks a bunch for the info and links, much appreciated! I can try to
> grab one for testing, but let me ask also: would you be willing to test
> that for me, Bitterblue? If so, I can resubmit today with the PCI check.
> 
> If not possible, no worries, I can buy one.
> Cheers,
> 
> 
> Guilherme

I tested your patch. Both my devices can still read the efuse.
Guilherme G. Piccoli Oct. 29, 2024, 5:58 p.m. UTC | #7
On 29/10/2024 13:55, Bitterblue Smith wrote:
> [...]
> 
> I tested your patch. Both my devices can still read the efuse.
> 

This is super great, thank you! Would you like to have a Tested-by on
that? If so, just send the name / email of your preference and I can add
it to the V2. I'll implement the PCI guard and re-submit, thanks again!

Cheers,


Guilherme
Bitterblue Smith Oct. 30, 2024, 1:17 p.m. UTC | #8
On 29/10/2024 19:58, Guilherme G. Piccoli wrote:
> On 29/10/2024 13:55, Bitterblue Smith wrote:
>> [...]
>>
>> I tested your patch. Both my devices can still read the efuse.
>>
> 
> This is super great, thank you! Would you like to have a Tested-by on
> that? If so, just send the name / email of your preference and I can add
> it to the V2. I'll implement the PCI guard and re-submit, thanks again!
> 
> Cheers,
> 
> 
> Guilherme

Sure,

Tested-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Guilherme G. Piccoli Oct. 30, 2024, 2:16 p.m. UTC | #9
On 30/10/2024 10:17, Bitterblue Smith wrote:
> [...]
> Sure,
> 
> Tested-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>

Thanks a bunch, V2 just sent:
https://lore.kernel.org/r/20241030141440.1153887-1-gpiccoli@igalia.com/

Patch

diff --git a/drivers/net/wireless/realtek/rtlwifi/efuse.c b/drivers/net/wireless/realtek/rtlwifi/efuse.c
index 82cf5fb5175f..2f75e376c0f6 100644
--- a/drivers/net/wireless/realtek/rtlwifi/efuse.c
+++ b/drivers/net/wireless/realtek/rtlwifi/efuse.c
@@ -178,7 +178,7 @@  void read_efuse_byte(struct ieee80211_hw *hw, u16 _offset, u8 *pbuf)
 
 	retry = 0;
 	value32 = rtl_read_dword(rtlpriv, rtlpriv->cfg->maps[EFUSE_CTRL]);
-	while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < 10000)) {
+	while (!(((value32 >> 24) & 0xff) & 0x80) && (retry < 10)) {
 		value32 = rtl_read_dword(rtlpriv,
 					 rtlpriv->cfg->maps[EFUSE_CTRL]);
 		retry++;