Message ID | 20241022013855.284783-1-marex@denx.de (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Kalle Valo |
Headers | show |
Series | wifi: wilc1000: Rework bus locking | expand |
Hi Marek, On 10/22/24 03:38, Marek Vasut wrote: > The bus locking in this driver is broken and produces subtle race > condition with ksdioirqd and its mmc_claim_host()/mmc_release_host() > usage in case of SDIO bus. Rework the locking to avoid this race > condition. > > The problem is the hif_cs mutex used in acquire_bus()/release_bus(), > which makes it look like calling acquire_bus() results in exclusive > access to the bus, but that is not true for SDIO bus. For SDIO bus, > to obtain exclusive access (any access, really), it is necessary to > call sdio_claim_host(), which is a wrapper around mmc_claim_host(), > which does its own locking. The acquire_bus() does not do that, but > the SDIO interface implementation does call sdio_claim_host() and > sdio_release_host() every single command, which is problematic. To > make things worse, wilc_sdio_interrupt() implementation called from > ksdioirqd first calls sdio_release_host(), then interrupt handling > and finally sdio_claim_host(). > > The core problem is that sdio_claim_host() cannot be done per command, > but has to be done per register/data IO which consists of multiple > commands. Is it really true ? What makes you say that we can not perform multiple operations under the same exclusive sdio section ? Usually the WILC register read/write consists of 3x CMD52 > to push in CSA pointer address and 1x CMD53 to read/write data to that > address. Most other accesses are also composed of multiple commands. > > Currently, if ksdioirqd wakes up and attempts to read SDIO_CCCR_INTx > to get pending SDIO IRQs in sdio_get_pending_irqs(), it can easily > perform that transfer between two consecutive CMD52 which are pushing > in the CSA pointer address and possibly disrupt the WILC operation. > This is undesired behavior. I agree about the observation, and then I disagree about the statement above on sdio_claim_host/sdio_release_host not meant to be used for multiple commands. I see plenty of sdio wireless drivers performing multiple sdio operations under the same sdio exclusive bus access section, either explicitely in their code, or through a sdio dedicated helper (eg: sdio_enable_func, sdio_disable_func). But more concerns below > > Rework the locking. > > Introduce new .hif_claim/.hif_release callbacks which implement bus > specific locking. Lock/unlock SDIO bus access using sdio_claim_host() > and sdio_release_host(), lock/unlock SPI bus access using the current > hif_cs mutex moved purely into the spi.c interface. Make acquire_bus() > and release_bus() call the .hif_claim/.hif_release() callbacks and do > not access the hif_cs mutex from there at all. > > Remove any SDIO bus locking used directly in commands and the broken > SDIO bus unlocking in wilc_sdio_interrupt(), this is no longer needed. > Fix up SDIO initialization code which newly needs sdio_claim_host() > and sdio_release_host(), since it cannot depend on the locking being > done per-command anymore. > > Signed-off-by: Marek Vasut <marex@denx.de> [...] > > -static void wilc_sdio_interrupt(struct sdio_func *func) > +static void wilc_sdio_claim(struct wilc *wilc) > +{ > + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); > + > + sdio_claim_host(func); > +} > + > +static void wilc_sdio_release(struct wilc *wilc) > { > + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); > + > sdio_release_host(func); > +} So with this series, we end up with some bus-specific operations needing some locking, but is now up to the upper layer to control this locking. This feels wrong. The driver has a dedicated sdio layer, so if we need some locking for sdio-specific operations, it should be handled autonomously by the sdio layer, right ? [...] > static int wilc_wlan_cfg_commit(struct wilc_vif *vif, int type, > diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.h b/drivers/net/wireless/microchip/wilc1000/wlan.h > index b9e7f9222eadd..ade2db95e8a0f 100644 > --- a/drivers/net/wireless/microchip/wilc1000/wlan.h > +++ b/drivers/net/wireless/microchip/wilc1000/wlan.h > @@ -403,6 +403,8 @@ struct wilc_hif_func { > void (*disable_interrupt)(struct wilc *nic); > int (*hif_reset)(struct wilc *wilc); > bool (*hif_is_init)(struct wilc *wilc); > + void (*hif_claim)(struct wilc *wilc); > + void (*hif_release)(struct wilc *wilc); So IIUC, your series push the hif_cs lock into each bus layer of the driver, remove any explicit call to bus-specific locking mechanism from those layers, and makes the upper layer control the locking. As mentioned above, I don't understand why those layers can not manage the bus-specific locking by themselves (which would be a big win for the upper layer). For SDIO specifically, I feel like letting the layer handle those claim/release would even allow to remove this hif_cs mutex (but we may still need a lock for SPI side) But I may be missing something, so feel free to prove me wrong.
On 10/22/24 12:43 PM, Alexis Lothoré wrote: > Hi Marek, Hi, > On 10/22/24 03:38, Marek Vasut wrote: >> The bus locking in this driver is broken and produces subtle race >> condition with ksdioirqd and its mmc_claim_host()/mmc_release_host() >> usage in case of SDIO bus. Rework the locking to avoid this race >> condition. >> >> The problem is the hif_cs mutex used in acquire_bus()/release_bus(), >> which makes it look like calling acquire_bus() results in exclusive >> access to the bus, but that is not true for SDIO bus. For SDIO bus, >> to obtain exclusive access (any access, really), it is necessary to >> call sdio_claim_host(), which is a wrapper around mmc_claim_host(), >> which does its own locking. The acquire_bus() does not do that, but >> the SDIO interface implementation does call sdio_claim_host() and >> sdio_release_host() every single command, which is problematic. To >> make things worse, wilc_sdio_interrupt() implementation called from >> ksdioirqd first calls sdio_release_host(), then interrupt handling >> and finally sdio_claim_host(). >> >> The core problem is that sdio_claim_host() cannot be done per command, >> but has to be done per register/data IO which consists of multiple >> commands. > > Is it really true ? What makes you say that we can not perform multiple > operations under the same exclusive sdio section ? What I am trying to say is this: With current code, this can happen, which is not good, because transfers from multiple threads can be interleaved and interfere with each other: thread 1 thread2 do_some_higher_level_op() { ... read_register_0x3b0000() { claim_bus CMD52 0x00 release bus ksdioirqd() { claim_bus CMD52 0x0f, lets read SDIO_CCCR_INTx release_bus claim bus } CMD52 0x00 release_bus claim_bus CMD52 0x3b release_bus claim_bus CMD53 lets read data release_bus } ... } What should happen is either: thread 1 thread2 ksdioirqd() { // option 1 claim_bus CMD52 0x0f, lets read SDIO_CCCR_INTx release_bus } do_some_higher_level_op() { claim_bus ... read_register_0x3b0000 { CMD52 0x00 CMD52 0x00 CMD52 0x3b CMD53 lets read data } ... read_another_register() ... release_bus } ksdioirqd() { // option 2 claim_bus CMD52 0x0f, lets read SDIO_CCCR_INTx release_bus } That's what this patch implements, to avoid the interference. Maybe I should include the infographics? Or reword this somehow? > Usually the WILC register read/write consists of 3x CMD52 >> to push in CSA pointer address and 1x CMD53 to read/write data to that >> address. Most other accesses are also composed of multiple commands. >> >> Currently, if ksdioirqd wakes up and attempts to read SDIO_CCCR_INTx >> to get pending SDIO IRQs in sdio_get_pending_irqs(), it can easily >> perform that transfer between two consecutive CMD52 which are pushing >> in the CSA pointer address and possibly disrupt the WILC operation. >> This is undesired behavior. > > I agree about the observation, and then I disagree about the statement above on > sdio_claim_host/sdio_release_host not meant to be used for multiple commands. I think we have a misunderstanding here, see above. > I see plenty of sdio wireless drivers performing multiple sdio operations under > the same sdio exclusive bus access section, either explicitely in their code, or > through a sdio dedicated helper (eg: sdio_enable_func, sdio_disable_func). > > But more concerns below >> >> Rework the locking. >> >> Introduce new .hif_claim/.hif_release callbacks which implement bus >> specific locking. Lock/unlock SDIO bus access using sdio_claim_host() >> and sdio_release_host(), lock/unlock SPI bus access using the current >> hif_cs mutex moved purely into the spi.c interface. Make acquire_bus() >> and release_bus() call the .hif_claim/.hif_release() callbacks and do >> not access the hif_cs mutex from there at all. >> >> Remove any SDIO bus locking used directly in commands and the broken >> SDIO bus unlocking in wilc_sdio_interrupt(), this is no longer needed. >> Fix up SDIO initialization code which newly needs sdio_claim_host() >> and sdio_release_host(), since it cannot depend on the locking being >> done per-command anymore. >> >> Signed-off-by: Marek Vasut <marex@denx.de> > > [...] > >> >> -static void wilc_sdio_interrupt(struct sdio_func *func) >> +static void wilc_sdio_claim(struct wilc *wilc) >> +{ >> + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); >> + >> + sdio_claim_host(func); >> +} >> + >> +static void wilc_sdio_release(struct wilc *wilc) >> { >> + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); >> + >> sdio_release_host(func); >> +} > > So with this series, we end up with some bus-specific operations needing some > locking, but is now up to the upper layer to control this locking. This feels > wrong. It always was the upper layer (wlan.c) that attempted to do bus locking, except it was incomplete. The acquire_bus()/release_bus() primitives seems to be an attempt at serializing bus access across multiple operations (which really boils down to multiple SPI or SDIO commands). The problem is, acquire_bus()/release_bus() does not really work, because it does not prevent e.g. ksdioirqd from inserting a command on the SDIO bus. SDIO (or really, mmc framework) has its own way of doing bus locking, that's the sdio_claim_host()/sdio_release_host(), SPI does have spi_bus_lock()/spi_bus_unlock() which I should use in V2. Those sdio_claim_host()/sdio_release_host() and spi_bus_lock()/spi_bus_unlock() should be called in acquire_bus()/release_bus() to correctly serialize bus access across multiple operations. That will also eliminate the custom and not really functional hif_cs mutex. > The driver has a dedicated sdio layer, so if we need some locking for > sdio-specific operations, it should be handled autonomously by the sdio layer, > right ? Not quite, I don't think the intention was to let anything communicate with the WILC device within block "protected" by acquire_bus()/release_bus() pair. That's why I believe this is where bus lock and unlock should happen too. > [...] > >> static int wilc_wlan_cfg_commit(struct wilc_vif *vif, int type, >> diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.h b/drivers/net/wireless/microchip/wilc1000/wlan.h >> index b9e7f9222eadd..ade2db95e8a0f 100644 >> --- a/drivers/net/wireless/microchip/wilc1000/wlan.h >> +++ b/drivers/net/wireless/microchip/wilc1000/wlan.h >> @@ -403,6 +403,8 @@ struct wilc_hif_func { >> void (*disable_interrupt)(struct wilc *nic); >> int (*hif_reset)(struct wilc *wilc); >> bool (*hif_is_init)(struct wilc *wilc); >> + void (*hif_claim)(struct wilc *wilc); >> + void (*hif_release)(struct wilc *wilc); > > So IIUC, your series push the hif_cs lock into each bus layer of the driver, > remove any explicit call to bus-specific locking mechanism from those layers, > and makes the upper layer control the locking. As mentioned above, I don't > understand why those layers can not manage the bus-specific locking by > themselves (which would be a big win for the upper layer). Because of acquire_bus()/release_bus() which I think is an attempt to serialize bus access across multiple complex operations (=commands sent to the card), see above. > For SDIO specifically, I feel like letting the layer handle those claim/release > would even allow to remove this hif_cs mutex (but we may still need a lock for > SPI side) > > But I may be missing something, so feel free to prove me wrong. With spi_bus_lock()/unlock() we can actually dispose of the hif_cs mutex altogether.
Hi Marek, kernel test robot noticed the following build errors: [auto build test ERROR on wireless-next/main] [also build test ERROR on next-20241022] [cannot apply to wireless/main linus/master v6.12-rc4] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Marek-Vasut/wifi-wilc1000-Rework-bus-locking/20241022-093954 base: https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git main patch link: https://lore.kernel.org/r/20241022013855.284783-1-marex%40denx.de patch subject: [PATCH] wifi: wilc1000: Rework bus locking config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20241023/202410230402.Cgu8obYR-lkp@intel.com/config) compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241023/202410230402.Cgu8obYR-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202410230402.Cgu8obYR-lkp@intel.com/ All errors (new ones prefixed by >>): >> drivers/net/wireless/microchip/wilc1000/spi.c:1113:20: error: no member named 'hif_cs' in 'struct wilc' 1113 | mutex_lock(&wilc->hif_cs); | ~~~~ ^ include/linux/mutex.h:166:44: note: expanded from macro 'mutex_lock' 166 | #define mutex_lock(lock) mutex_lock_nested(lock, 0) | ^~~~ drivers/net/wireless/microchip/wilc1000/spi.c:1118:22: error: no member named 'hif_cs' in 'struct wilc' 1118 | mutex_unlock(&wilc->hif_cs); | ~~~~ ^ drivers/net/wireless/microchip/wilc1000/spi.c:1147:23: error: no member named 'hif_cs' in 'struct wilc' 1147 | mutex_destroy(&wilc->hif_cs); | ~~~~ ^ >> drivers/net/wireless/microchip/wilc1000/spi.c:1156:14: error: use of undeclared identifier 'spi' 1156 | mutex_init(&spi->hif_cs); | ^ >> drivers/net/wireless/microchip/wilc1000/spi.c:1156:14: error: use of undeclared identifier 'spi' 5 errors generated. Kconfig warnings: (for reference only) WARNING: unmet direct dependencies detected for MODVERSIONS Depends on [n]: MODULES [=y] && !COMPILE_TEST [=y] Selected by [y]: - RANDSTRUCT_FULL [=y] && (CC_HAS_RANDSTRUCT [=y] || GCC_PLUGINS [=n]) && MODULES [=y] vim +1113 drivers/net/wireless/microchip/wilc1000/spi.c 1105 1106 /******************************************** 1107 * 1108 * Bus interfaces 1109 * 1110 ********************************************/ 1111 static void wilc_spi_claim(struct wilc *wilc) 1112 { > 1113 mutex_lock(&wilc->hif_cs); 1114 } 1115 1116 static void wilc_spi_release(struct wilc *wilc) 1117 { > 1118 mutex_unlock(&wilc->hif_cs); 1119 } 1120 1121 static int wilc_spi_reset(struct wilc *wilc) 1122 { 1123 struct spi_device *spi = to_spi_device(wilc->dev); 1124 struct wilc_spi *spi_priv = wilc->bus_data; 1125 int result; 1126 1127 result = wilc_spi_special_cmd(wilc, CMD_RESET); 1128 if (result && !spi_priv->probing_crc) 1129 dev_err(&spi->dev, "Failed cmd reset\n"); 1130 1131 return result; 1132 } 1133 1134 static bool wilc_spi_is_init(struct wilc *wilc) 1135 { 1136 struct wilc_spi *spi_priv = wilc->bus_data; 1137 1138 return spi_priv->isinit; 1139 } 1140 1141 static int wilc_spi_deinit(struct wilc *wilc) 1142 { 1143 struct wilc_spi *spi_priv = wilc->bus_data; 1144 1145 spi_priv->isinit = false; 1146 wilc_wlan_power(wilc, false); 1147 mutex_destroy(&wilc->hif_cs); 1148 return 0; 1149 } 1150 1151 static int wilc_spi_init(struct wilc *wilc, bool resume) 1152 { 1153 struct wilc_spi *spi_priv = wilc->bus_data; 1154 int ret; 1155 > 1156 mutex_init(&spi->hif_cs); 1157 1158 if (spi_priv->isinit) { 1159 /* Confirm we can read chipid register without error: */ 1160 if (wilc_validate_chipid(wilc) == 0) 1161 return 0; 1162 } 1163 1164 wilc_wlan_power(wilc, true); 1165 1166 ret = wilc_spi_configure_bus_protocol(wilc); 1167 if (ret) { 1168 wilc_wlan_power(wilc, false); 1169 return ret; 1170 } 1171 1172 spi_priv->isinit = true; 1173 1174 return 0; 1175 } 1176
Hi Marek, kernel test robot noticed the following build errors: [auto build test ERROR on wireless-next/main] [also build test ERROR on next-20241022] [cannot apply to wireless/main linus/master v6.12-rc4] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Marek-Vasut/wifi-wilc1000-Rework-bus-locking/20241022-093954 base: https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git main patch link: https://lore.kernel.org/r/20241022013855.284783-1-marex%40denx.de patch subject: [PATCH] wifi: wilc1000: Rework bus locking config: loongarch-allmodconfig (https://download.01.org/0day-ci/archive/20241023/202410230937.zWSJkSuE-lkp@intel.com/config) compiler: loongarch64-linux-gcc (GCC) 14.1.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241023/202410230937.zWSJkSuE-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202410230937.zWSJkSuE-lkp@intel.com/ All error/warnings (new ones prefixed by >>): In file included from include/linux/notifier.h:14, from include/linux/clk.h:14, from drivers/net/wireless/microchip/wilc1000/spi.c:7: drivers/net/wireless/microchip/wilc1000/spi.c: In function 'wilc_spi_claim': >> drivers/net/wireless/microchip/wilc1000/spi.c:1113:25: error: 'struct wilc' has no member named 'hif_cs' 1113 | mutex_lock(&wilc->hif_cs); | ^~ include/linux/mutex.h:166:44: note: in definition of macro 'mutex_lock' 166 | #define mutex_lock(lock) mutex_lock_nested(lock, 0) | ^~~~ drivers/net/wireless/microchip/wilc1000/spi.c: In function 'wilc_spi_release': drivers/net/wireless/microchip/wilc1000/spi.c:1118:27: error: 'struct wilc' has no member named 'hif_cs' 1118 | mutex_unlock(&wilc->hif_cs); | ^~ drivers/net/wireless/microchip/wilc1000/spi.c: In function 'wilc_spi_deinit': drivers/net/wireless/microchip/wilc1000/spi.c:1147:28: error: 'struct wilc' has no member named 'hif_cs' 1147 | mutex_destroy(&wilc->hif_cs); | ^~ drivers/net/wireless/microchip/wilc1000/spi.c: In function 'wilc_spi_init': >> drivers/net/wireless/microchip/wilc1000/spi.c:1156:21: error: 'spi' undeclared (first use in this function) 1156 | mutex_init(&spi->hif_cs); | ^~~ include/linux/mutex.h:64:23: note: in definition of macro 'mutex_init' 64 | __mutex_init((mutex), #mutex, &__key); \ | ^~~~~ drivers/net/wireless/microchip/wilc1000/spi.c:1156:21: note: each undeclared identifier is reported only once for each function it appears in 1156 | mutex_init(&spi->hif_cs); | ^~~ include/linux/mutex.h:64:23: note: in definition of macro 'mutex_init' 64 | __mutex_init((mutex), #mutex, &__key); \ | ^~~~~ drivers/net/wireless/microchip/wilc1000/spi.c: At top level: >> drivers/net/wireless/microchip/wilc1000/spi.c:1116:13: warning: 'wilc_spi_release' defined but not used [-Wunused-function] 1116 | static void wilc_spi_release(struct wilc *wilc) | ^~~~~~~~~~~~~~~~ >> drivers/net/wireless/microchip/wilc1000/spi.c:1111:13: warning: 'wilc_spi_claim' defined but not used [-Wunused-function] 1111 | static void wilc_spi_claim(struct wilc *wilc) | ^~~~~~~~~~~~~~ vim +1113 drivers/net/wireless/microchip/wilc1000/spi.c 1105 1106 /******************************************** 1107 * 1108 * Bus interfaces 1109 * 1110 ********************************************/ > 1111 static void wilc_spi_claim(struct wilc *wilc) 1112 { > 1113 mutex_lock(&wilc->hif_cs); 1114 } 1115 > 1116 static void wilc_spi_release(struct wilc *wilc) 1117 { 1118 mutex_unlock(&wilc->hif_cs); 1119 } 1120 1121 static int wilc_spi_reset(struct wilc *wilc) 1122 { 1123 struct spi_device *spi = to_spi_device(wilc->dev); 1124 struct wilc_spi *spi_priv = wilc->bus_data; 1125 int result; 1126 1127 result = wilc_spi_special_cmd(wilc, CMD_RESET); 1128 if (result && !spi_priv->probing_crc) 1129 dev_err(&spi->dev, "Failed cmd reset\n"); 1130 1131 return result; 1132 } 1133 1134 static bool wilc_spi_is_init(struct wilc *wilc) 1135 { 1136 struct wilc_spi *spi_priv = wilc->bus_data; 1137 1138 return spi_priv->isinit; 1139 } 1140 1141 static int wilc_spi_deinit(struct wilc *wilc) 1142 { 1143 struct wilc_spi *spi_priv = wilc->bus_data; 1144 1145 spi_priv->isinit = false; 1146 wilc_wlan_power(wilc, false); 1147 mutex_destroy(&wilc->hif_cs); 1148 return 0; 1149 } 1150 1151 static int wilc_spi_init(struct wilc *wilc, bool resume) 1152 { 1153 struct wilc_spi *spi_priv = wilc->bus_data; 1154 int ret; 1155 > 1156 mutex_init(&spi->hif_cs); 1157 1158 if (spi_priv->isinit) { 1159 /* Confirm we can read chipid register without error: */ 1160 if (wilc_validate_chipid(wilc) == 0) 1161 return 0; 1162 } 1163 1164 wilc_wlan_power(wilc, true); 1165 1166 ret = wilc_spi_configure_bus_protocol(wilc); 1167 if (ret) { 1168 wilc_wlan_power(wilc, false); 1169 return ret; 1170 } 1171 1172 spi_priv->isinit = true; 1173 1174 return 0; 1175 } 1176
Hi, On 10/22/24 06:19, Marek Vasut wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the > content is safe > > On 10/22/24 12:43 PM, Alexis Lothoré wrote: >> Hi Marek, > > Hi, > >> On 10/22/24 03:38, Marek Vasut wrote: >>> The bus locking in this driver is broken and produces subtle race >>> condition with ksdioirqd and its mmc_claim_host()/mmc_release_host() >>> usage in case of SDIO bus. Rework the locking to avoid this race >>> condition. >>> >>> The problem is the hif_cs mutex used in acquire_bus()/release_bus(), >>> which makes it look like calling acquire_bus() results in exclusive >>> access to the bus, but that is not true for SDIO bus. For SDIO bus, >>> to obtain exclusive access (any access, really), it is necessary to >>> call sdio_claim_host(), which is a wrapper around mmc_claim_host(), >>> which does its own locking. The acquire_bus() does not do that, but >>> the SDIO interface implementation does call sdio_claim_host() and >>> sdio_release_host() every single command, which is problematic. To >>> make things worse, wilc_sdio_interrupt() implementation called from >>> ksdioirqd first calls sdio_release_host(), then interrupt handling >>> and finally sdio_claim_host(). >>> >>> The core problem is that sdio_claim_host() cannot be done per command, >>> but has to be done per register/data IO which consists of multiple >>> commands. >> >> Is it really true ? What makes you say that we can not perform multiple >> operations under the same exclusive sdio section ? > > What I am trying to say is this: > > With current code, this can happen, which is not good, because transfers > from multiple threads can be interleaved and interfere with each other: Did you observe any bus failure in your test setup with SDIO. Is there any configure to recreate it. The SDIO transfer may appear to be split into into multiple transaction but these calls should not impact each other since most of them are updating the different registers except WILC_SDIO_FBR_CSA_REG register, which is used for CSA read/write. If needed, wilc_sdio_set_func0_csa_address() can be modified to club the 3x CMD52 and 1x CMD53 into a single transfer API. In my opinion, If sdio_claim_host() is moved to acquire_bus() API then the SDIO bus will be claimed longer than required especially in wilc_wlan_handle_txq() API. Ideally, calling sdio_claim_host() call just before the transfer is enough but now the SDIO I/O bus would be continuously blocked for multiple independent SDIO transactions that is not necessary. > > thread 1 thread2 > do_some_higher_level_op() { > ... > read_register_0x3b0000() { > claim_bus > CMD52 0x00 > release bus ksdioirqd() { > claim_bus > CMD52 0x0f, lets read SDIO_CCCR_INTx > release_bus > claim bus } > CMD52 0x00 > release_bus > claim_bus > CMD52 0x3b > release_bus > claim_bus > CMD53 lets read data > release_bus > } > ... > } > > What should happen is either: > > thread 1 thread2 > ksdioirqd() { // option 1 > claim_bus > CMD52 0x0f, lets read SDIO_CCCR_INTx > release_bus > } > do_some_higher_level_op() { > claim_bus > ... > read_register_0x3b0000 { > CMD52 0x00 > CMD52 0x00 > CMD52 0x3b > CMD53 lets read data > } > ... > read_another_register() > ... > release_bus > } > ksdioirqd() { // option 2 > claim_bus > CMD52 0x0f, lets read SDIO_CCCR_INTx > release_bus > } > > That's what this patch implements, to avoid the interference. > > Maybe I should include the infographics? Or reword this somehow? > >> Usually the WILC register read/write consists of 3x CMD52 >>> to push in CSA pointer address and 1x CMD53 to read/write data to that >>> address. Most other accesses are also composed of multiple commands. >>> >>> Currently, if ksdioirqd wakes up and attempts to read SDIO_CCCR_INTx >>> to get pending SDIO IRQs in sdio_get_pending_irqs(), it can easily >>> perform that transfer between two consecutive CMD52 which are pushing >>> in the CSA pointer address and possibly disrupt the WILC operation. >>> This is undesired behavior. >> >> I agree about the observation, and then I disagree about the statement above on >> sdio_claim_host/sdio_release_host not meant to be used for multiple commands. > > I think we have a misunderstanding here, see above. > >> I see plenty of sdio wireless drivers performing multiple sdio operations under >> the same sdio exclusive bus access section, either explicitely in their >> code, or >> through a sdio dedicated helper (eg: sdio_enable_func, sdio_disable_func). >> >> But more concerns below >>> >>> Rework the locking. >>> >>> Introduce new .hif_claim/.hif_release callbacks which implement bus >>> specific locking. Lock/unlock SDIO bus access using sdio_claim_host() >>> and sdio_release_host(), lock/unlock SPI bus access using the current >>> hif_cs mutex moved purely into the spi.c interface. Make acquire_bus() >>> and release_bus() call the .hif_claim/.hif_release() callbacks and do >>> not access the hif_cs mutex from there at all. >>> >>> Remove any SDIO bus locking used directly in commands and the broken >>> SDIO bus unlocking in wilc_sdio_interrupt(), this is no longer needed. >>> Fix up SDIO initialization code which newly needs sdio_claim_host() >>> and sdio_release_host(), since it cannot depend on the locking being >>> done per-command anymore. >>> >>> Signed-off-by: Marek Vasut <marex@denx.de> >> >> [...] >> >>> >>> -static void wilc_sdio_interrupt(struct sdio_func *func) >>> +static void wilc_sdio_claim(struct wilc *wilc) >>> +{ >>> + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); >>> + >>> + sdio_claim_host(func); >>> +} >>> + >>> +static void wilc_sdio_release(struct wilc *wilc) >>> { >>> + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); >>> + >>> sdio_release_host(func); >>> +} >> >> So with this series, we end up with some bus-specific operations needing some >> locking, but is now up to the upper layer to control this locking. This feels >> wrong. > > It always was the upper layer (wlan.c) that attempted to do bus locking, > except it was incomplete. The acquire_bus()/release_bus() primitives > seems to be an attempt at serializing bus access across multiple > operations (which really boils down to multiple SPI or SDIO commands). > > The problem is, acquire_bus()/release_bus() does not really work, > because it does not prevent e.g. ksdioirqd from inserting a command on > the SDIO bus. SDIO (or really, mmc framework) has its own way of doing > bus locking, that's the sdio_claim_host()/sdio_release_host(), SPI does > have spi_bus_lock()/spi_bus_unlock() which I should use in V2. > > Those sdio_claim_host()/sdio_release_host() and > spi_bus_lock()/spi_bus_unlock() should be called in > acquire_bus()/release_bus() to correctly serialize bus access across > multiple operations. That will also eliminate the custom and not really > functional hif_cs mutex. > >> The driver has a dedicated sdio layer, so if we need some locking for >> sdio-specific operations, it should be handled autonomously by the sdio layer, >> right ? > > Not quite, I don't think the intention was to let anything communicate > with the WILC device within block "protected" by > acquire_bus()/release_bus() pair. That's why I believe this is where bus > lock and unlock should happen too. > >> [...] >> >>> static int wilc_wlan_cfg_commit(struct wilc_vif *vif, int type, >>> diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.h >>> b/drivers/net/wireless/microchip/wilc1000/wlan.h >>> index b9e7f9222eadd..ade2db95e8a0f 100644 >>> --- a/drivers/net/wireless/microchip/wilc1000/wlan.h >>> +++ b/drivers/net/wireless/microchip/wilc1000/wlan.h >>> @@ -403,6 +403,8 @@ struct wilc_hif_func { >>> void (*disable_interrupt)(struct wilc *nic); >>> int (*hif_reset)(struct wilc *wilc); >>> bool (*hif_is_init)(struct wilc *wilc); >>> + void (*hif_claim)(struct wilc *wilc); >>> + void (*hif_release)(struct wilc *wilc); >> >> So IIUC, your series push the hif_cs lock into each bus layer of the driver, >> remove any explicit call to bus-specific locking mechanism from those layers, >> and makes the upper layer control the locking. As mentioned above, I don't >> understand why those layers can not manage the bus-specific locking by >> themselves (which would be a big win for the upper layer). > > Because of acquire_bus()/release_bus() which I think is an attempt to > serialize bus access across multiple complex operations (=commands sent > to the card), see above. > >> For SDIO specifically, I feel like letting the layer handle those claim/release >> would even allow to remove this hif_cs mutex (but we may still need a lock for >> SPI side) >> >> But I may be missing something, so feel free to prove me wrong. > With spi_bus_lock()/unlock() we can actually dispose of the hif_cs mutex > altogether.
Hello Marek, On 10/22/24 15:19, Marek Vasut wrote: > On 10/22/24 12:43 PM, Alexis Lothoré wrote: >> Hi Marek, > > Hi, > >> On 10/22/24 03:38, Marek Vasut wrote: >>> The bus locking in this driver is broken and produces subtle race >>> condition with ksdioirqd and its mmc_claim_host()/mmc_release_host() >>> usage in case of SDIO bus. Rework the locking to avoid this race >>> condition. >>> >>> The problem is the hif_cs mutex used in acquire_bus()/release_bus(), >>> which makes it look like calling acquire_bus() results in exclusive >>> access to the bus, but that is not true for SDIO bus. For SDIO bus, >>> to obtain exclusive access (any access, really), it is necessary to >>> call sdio_claim_host(), which is a wrapper around mmc_claim_host(), >>> which does its own locking. The acquire_bus() does not do that, but >>> the SDIO interface implementation does call sdio_claim_host() and >>> sdio_release_host() every single command, which is problematic. To >>> make things worse, wilc_sdio_interrupt() implementation called from >>> ksdioirqd first calls sdio_release_host(), then interrupt handling >>> and finally sdio_claim_host(). >>> >>> The core problem is that sdio_claim_host() cannot be done per command, >>> but has to be done per register/data IO which consists of multiple >>> commands. >> >> Is it really true ? What makes you say that we can not perform multiple >> operations under the same exclusive sdio section ? > > What I am trying to say is this: > > With current code, this can happen, which is not good, because transfers from > multiple threads can be interleaved and interfere with each other: > > thread 1 thread2 > do_some_higher_level_op() { > ... > read_register_0x3b0000() { > claim_bus > CMD52 0x00 > release bus ksdioirqd() { > claim_bus > CMD52 0x0f, lets read SDIO_CCCR_INTx > release_bus > claim bus } > CMD52 0x00 > release_bus > claim_bus > CMD52 0x3b > release_bus > claim_bus > CMD53 lets read data > release_bus > } > ... > } > > What should happen is either: > > thread 1 thread2 > ksdioirqd() { // option 1 > claim_bus > CMD52 0x0f, lets read SDIO_CCCR_INTx > release_bus > } > do_some_higher_level_op() { > claim_bus > ... > read_register_0x3b0000 { > CMD52 0x00 > CMD52 0x00 > CMD52 0x3b > CMD53 lets read data > } > ... > read_another_register() > ... > release_bus > } > ksdioirqd() { // option 2 > claim_bus > CMD52 0x0f, lets read SDIO_CCCR_INTx > release_bus > } > > That's what this patch implements, to avoid the interference. > > Maybe I should include the infographics? Or reword this somehow? What I may have misunderstood is your first sentence ("sdio_claim_host() cannot be done per command, but has to be done per register/data IO which consists of multiple commands", especially command VS reg/data io), but your graph clarified it for me, thanks, so in the end we agree on this :) That may just be me having poorly interpreted, so no need to add the graphs to the commit [...] >>> static int wilc_wlan_cfg_commit(struct wilc_vif *vif, int type, >>> diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.h b/drivers/net/ >>> wireless/microchip/wilc1000/wlan.h >>> index b9e7f9222eadd..ade2db95e8a0f 100644 >>> --- a/drivers/net/wireless/microchip/wilc1000/wlan.h >>> +++ b/drivers/net/wireless/microchip/wilc1000/wlan.h >>> @@ -403,6 +403,8 @@ struct wilc_hif_func { >>> void (*disable_interrupt)(struct wilc *nic); >>> int (*hif_reset)(struct wilc *wilc); >>> bool (*hif_is_init)(struct wilc *wilc); >>> + void (*hif_claim)(struct wilc *wilc); >>> + void (*hif_release)(struct wilc *wilc); >> >> So IIUC, your series push the hif_cs lock into each bus layer of the driver, >> remove any explicit call to bus-specific locking mechanism from those layers, >> and makes the upper layer control the locking. As mentioned above, I don't >> understand why those layers can not manage the bus-specific locking by >> themselves (which would be a big win for the upper layer). > > Because of acquire_bus()/release_bus() which I think is an attempt to serialize > bus access across multiple complex operations (=commands sent to the card), see > above. Taking a further look at some examples in the driver, I see that indeed the "scope" of acquire_bus/release_bus is larger than simple bus operations. So I withdraw my proposal which was wrong. Thanks, Alexis
On 10/23/24 9:54 AM, Alexis Lothoré wrote: Hello Alexis, >> ksdioirqd() { // option 2 >> claim_bus >> CMD52 0x0f, lets read SDIO_CCCR_INTx >> release_bus >> } >> >> That's what this patch implements, to avoid the interference. >> >> Maybe I should include the infographics? Or reword this somehow? > > What I may have misunderstood is your first sentence ("sdio_claim_host() cannot > be done per command, but has to be done per register/data IO which consists of > multiple commands", especially command VS reg/data io), but your graph clarified > it for me, thanks, so in the end we agree on this :) That may just be me having > poorly interpreted, so no need to add the graphs to the commit You're welcome. As long as we can understand each other with one extra round trip, all is good :) > [...] > >>>> static int wilc_wlan_cfg_commit(struct wilc_vif *vif, int type, >>>> diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.h b/drivers/net/ >>>> wireless/microchip/wilc1000/wlan.h >>>> index b9e7f9222eadd..ade2db95e8a0f 100644 >>>> --- a/drivers/net/wireless/microchip/wilc1000/wlan.h >>>> +++ b/drivers/net/wireless/microchip/wilc1000/wlan.h >>>> @@ -403,6 +403,8 @@ struct wilc_hif_func { >>>> void (*disable_interrupt)(struct wilc *nic); >>>> int (*hif_reset)(struct wilc *wilc); >>>> bool (*hif_is_init)(struct wilc *wilc); >>>> + void (*hif_claim)(struct wilc *wilc); >>>> + void (*hif_release)(struct wilc *wilc); >>> >>> So IIUC, your series push the hif_cs lock into each bus layer of the driver, >>> remove any explicit call to bus-specific locking mechanism from those layers, >>> and makes the upper layer control the locking. As mentioned above, I don't >>> understand why those layers can not manage the bus-specific locking by >>> themselves (which would be a big win for the upper layer). >> >> Because of acquire_bus()/release_bus() which I think is an attempt to serialize >> bus access across multiple complex operations (=commands sent to the card), see >> above. > > Taking a further look at some examples in the driver, I see that indeed the > "scope" of acquire_bus/release_bus is larger than simple bus operations. So I > withdraw my proposal which was wrong. All right.
On 10/23/24 9:17 AM, Ajay.Kathat@microchip.com wrote: Hello Ajay, >> What I am trying to say is this: >> >> With current code, this can happen, which is not good, because transfers >> from multiple threads can be interleaved and interfere with each other: > > Did you observe any bus failure in your test setup with SDIO. Is there any > configure to recreate it. I am observing sporadic command and data CRC errors on STM32MP157F system with SDIO WILC3000. > The SDIO transfer may appear to be split into into multiple transaction but > these calls should not impact each other since most of them are updating the > different registers except WILC_SDIO_FBR_CSA_REG register, which is used for > CSA read/write. If needed, wilc_sdio_set_func0_csa_address() can be modified > to club the 3x CMD52 and 1x CMD53 into a single transfer API. > > In my opinion, If sdio_claim_host() is moved to acquire_bus() API then the > SDIO bus will be claimed longer than required especially in > wilc_wlan_handle_txq() API. Ideally, calling sdio_claim_host() call just > before the transfer is enough but now the SDIO I/O bus would be continuously > blocked for multiple independent SDIO transactions that is not necessary. Why would that pose a problem ? I am more concerned that ksdioirqd can insert a command transfer right in the middle of CMD52/CMD53 register read composite transfer, because while ksdioirqd does use proper sdio_claim/release_host, this driver does it per-SDIO-command instead of per the whole e.g. register read "transaction". [...]
Hello Marek, On 10/23/24 07:44, Marek Vasut wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the > content is safe > > On 10/23/24 9:17 AM, Ajay.Kathat@microchip.com wrote: > > Hello Ajay, > >>> What I am trying to say is this: >>> >>> With current code, this can happen, which is not good, because transfers >>> from multiple threads can be interleaved and interfere with each other: >> >> Did you observe any bus failure in your test setup with SDIO. Is there any >> configure to recreate it. > > I am observing sporadic command and data CRC errors on STM32MP157F > system with SDIO WILC3000. Does this patch help to resolve the CRC errors? Do you observe the CRC errors during the bulk data transfer(iPerf) or does it even happen during the basic WiFi operation like(scan/connect or basic ping operation). Is power-save enabled during the test. With PS enabled, The SDIO commands may fail momentarily but it should recover. > >> The SDIO transfer may appear to be split into into multiple transaction but >> these calls should not impact each other since most of them are updating the >> different registers except WILC_SDIO_FBR_CSA_REG register, which is used for >> CSA read/write. If needed, wilc_sdio_set_func0_csa_address() can be modified >> to club the 3x CMD52 and 1x CMD53 into a single transfer API. >> >> In my opinion, If sdio_claim_host() is moved to acquire_bus() API then the >> SDIO bus will be claimed longer than required especially in >> wilc_wlan_handle_txq() API. Ideally, calling sdio_claim_host() call just >> before the transfer is enough but now the SDIO I/O bus would be continuously >> blocked for multiple independent SDIO transactions that is not necessary. > > Why would that pose a problem ? wilc_wlan_handle_txq() performs many operations on different registers which are not related. It will block the SDIO bux for longer. Otherwise those registers are allowed to update separately from the WILC SD side. > > I am more concerned that ksdioirqd can insert a command transfer right > in the middle of CMD52/CMD53 register read composite transfer, because > while ksdioirqd does use proper sdio_claim/release_host, this driver > does it per-SDIO-command instead of per the whole e.g. register read > "transaction". > I think, using sdio_claim/release for each-SDIO command should suffice because the CMD52/CMD53 modify the specific registers that are unrelated. Each CMD52/53 should work properly and independently provided they are protected with sdio_claim/release. Additionally, there is no WILC SD module limitation requiring a strict order for CMD52/CMD53, except for Card Storage Area (CSA) read/write operations. For CSA read/write operations, which are necessary to read/write any specific address from the card, multiple CMD52 commands are used to pass the desired address to be read/written using registers (0x10c, 0x10d, 0x10e). Then, CMD53 is used to read/write to address 0x10f (CSA data window register). This complete command sequence is required for a single CSA address read/write. Based on this requirement, CSA read/write operations can cause issues if they run in parallel with another CSA read/write operation, but not with other CMD52 and CMD53 commands. Therefore, one approach to resolve this issue is to add sdio_claim/release around wilc_sdio_set_func0_csa_address(). This way, this function will be treated as a single operation and it will only modify the required command flow. Regards, Ajay
On 10/23/24 7:54 PM, Ajay.Kathat@microchip.com wrote: > Hello Marek, Hi, >>>> What I am trying to say is this: >>>> >>>> With current code, this can happen, which is not good, because transfers >>>> from multiple threads can be interleaved and interfere with each other: >>> >>> Did you observe any bus failure in your test setup with SDIO. Is there any >>> configure to recreate it. >> >> I am observing sporadic command and data CRC errors on STM32MP157F >> system with SDIO WILC3000. > > Does this patch help to resolve the CRC errors? No > Do you observe the CRC errors during the bulk data transfer(iPerf) or does it > even happen during the basic WiFi operation like(scan/connect or basic ping > operation). I can trigger them during this test: $ while true ; ifconfig wlan0 up ; ifconfig wlan0 down ; done But they happen sporadically and seemingly at random. I already stuffed the driver with trace_printk()s through and through, but the trace log does not indicate anything that would be off in any way. > Is power-save enabled during the test. With PS enabled, The SDIO commands may > fail momentarily but it should recover. It seems it gets enabled after first ifconfig up, that's a good hint, I'll try to disable it and see if that makes them errors go away. Thanks! Do you have any details on WHY would such sporadic errors occur and how to make those go away even with PS enabled ? >>> The SDIO transfer may appear to be split into into multiple transaction but >>> these calls should not impact each other since most of them are updating the >>> different registers except WILC_SDIO_FBR_CSA_REG register, which is used for >>> CSA read/write. If needed, wilc_sdio_set_func0_csa_address() can be modified >>> to club the 3x CMD52 and 1x CMD53 into a single transfer API. >>> >>> In my opinion, If sdio_claim_host() is moved to acquire_bus() API then the >>> SDIO bus will be claimed longer than required especially in >>> wilc_wlan_handle_txq() API. Ideally, calling sdio_claim_host() call just >>> before the transfer is enough but now the SDIO I/O bus would be continuously >>> blocked for multiple independent SDIO transactions that is not necessary. >> >> Why would that pose a problem ? > > wilc_wlan_handle_txq() performs many operations on different registers which > are not related. It will block the SDIO bux for longer. Otherwise those > registers are allowed to update separately from the WILC SD side. And is that blocking of SD bus actually a problem ? >> I am more concerned that ksdioirqd can insert a command transfer right >> in the middle of CMD52/CMD53 register read composite transfer, because >> while ksdioirqd does use proper sdio_claim/release_host, this driver >> does it per-SDIO-command instead of per the whole e.g. register read >> "transaction". >> > > I think, using sdio_claim/release for each-SDIO command should suffice because > the CMD52/CMD53 modify the specific registers that are unrelated. Each > CMD52/53 should work properly and independently provided they are protected > with sdio_claim/release. > Additionally, there is no WILC SD module limitation requiring a strict order > for CMD52/CMD53, except for Card Storage Area (CSA) read/write operations. Can multiple register IO operations like that be interleaved with ksdioirqd access to SDIO_CCCR_INTx too ? Take e.g. wilc_wlan_handle_txq(), is the following order legal? thread1 | thread2 ret = func->hif_write_reg(wilc, | WILC_CORTUS_INTERRUPT_BASE, | 1); | | ksdioirqd { | ret = mmc_io_rw_direct(card, 0, | 0, SDIO_CCCR_INTx, 0, pending); | } ret = func->hif_read_reg(wilc, | WILC_CORTUS_INTERRUPT_BASE, | ®); | > For CSA read/write operations, which are necessary to read/write any specific > address from the card, multiple CMD52 commands are used to pass the desired > address to be read/written using registers (0x10c, 0x10d, 0x10e). Then, CMD53 > is used to read/write to address 0x10f (CSA data window register). This > complete command sequence is required for a single CSA address read/write. > Based on this requirement, CSA read/write operations can cause issues if they > run in parallel with another CSA read/write operation, but not with other > CMD52 and CMD53 commands. Unrelated to this, but ... is it possible to send the entire 3-Byte register CSA address using a single command instead of 3 separate commands ? > Therefore, one approach to resolve this issue is to add sdio_claim/release > around wilc_sdio_set_func0_csa_address(). This way, this function will be > treated as a single operation and it will only modify the required command flow. I can do the serialization per-command, but please see above.
On 10/23/24 8:47 PM, Marek Vasut wrote: Hello again, >> Is power-save enabled during the test. With PS enabled, The SDIO >> commands may >> fail momentarily but it should recover. > > It seems it gets enabled after first ifconfig up, that's a good hint, > I'll try to disable it and see if that makes them errors go away. Thanks! > > Do you have any details on WHY would such sporadic errors occur and how > to make those go away even with PS enabled ? Can you explain why does uAPSD (iw ...set power_save off) adversely affect SDIO bus stability ? Can you explain how to prevent that or shall we disable uAPSD altogether ? Thank you
Hi Marek, On 11/4/24 04:44, Marek Vasut wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the > content is safe > > On 10/23/24 8:47 PM, Marek Vasut wrote: > > Hello again, > >>> Is power-save enabled during the test. With PS enabled, The SDIO >>> commands may >>> fail momentarily but it should recover. >> >> It seems it gets enabled after first ifconfig up, that's a good hint, >> I'll try to disable it and see if that makes them errors go away. Thanks! >> >> Do you have any details on WHY would such sporadic errors occur and how >> to make those go away even with PS enabled ? > Can you explain why does uAPSD (iw ...set power_save off) adversely > affect SDIO bus stability ? > SDIO bus errors can occur for different reasons and those errors can be of recoverable or non-recoverable type. For non-recoverable failures like firmware crashes, the retry mechanism may not help to resolve the issue. If the error is recoverable then driver should work with retry attempts. I think you are observing the bus errors messages and it is recovering after that. Is my understanding correct? With the previous shared test procedure, which makes the interface up/down continuously, the station may not go into the Doze/Awake sequence since that mode switching gets activated after connection with AP. > Can you explain how to prevent that or shall we disable uAPSD altogether ? Could you please share the test procedure and logs. I am occupied at the moment but I shall make some time to look into it and get a better understanding. Regard, Ajay
On 11/7/24 2:28 AM, Ajay.Kathat@microchip.com wrote: > Hi Marek, Hello Ajay, > On 11/4/24 04:44, Marek Vasut wrote: >> EXTERNAL EMAIL: Do not click links or open attachments unless you know the >> content is safe >> >> On 10/23/24 8:47 PM, Marek Vasut wrote: >> >> Hello again, >> >>>> Is power-save enabled during the test. With PS enabled, The SDIO >>>> commands may >>>> fail momentarily but it should recover. >>> >>> It seems it gets enabled after first ifconfig up, that's a good hint, >>> I'll try to disable it and see if that makes them errors go away. Thanks! >>> >>> Do you have any details on WHY would such sporadic errors occur and how >>> to make those go away even with PS enabled ? >> Can you explain why does uAPSD (iw ...set power_save off) adversely >> affect SDIO bus stability ? >> > > SDIO bus errors can occur for different reasons and those errors can be of > recoverable or non-recoverable type. For non-recoverable failures like > firmware crashes, the retry mechanism may not help to resolve the issue. If > the error is recoverable then driver should work with retry attempts. > I think you are observing the bus errors messages and it is recovering after > that. Is my understanding correct? I don't know. Is there any way to make the WILC firmware produce debug output , so we can figure out what is going on "on the other side" ? Are you able to provide me (maybe off-list) some debug firmware build ? (or can I get firmware sources and build and debug my own WILC firmware on the Cortus CPU?) > With the previous shared test procedure, which makes the interface up/down > continuously, the station may not go into the Doze/Awake sequence since that > mode switching gets activated after connection with AP. What does this mean ? I can trigger the SDIO errors even without being connected to any AP , so this is something between the WILC and the SDIO host, the radio is likely not involved , right ? >> Can you explain how to prevent that or shall we disable uAPSD altogether ? > > Could you please share the test procedure and logs. I am occupied at the > moment but I shall make some time to look into it and get a better understanding. The simplest test procedure is this: $ while true ; do ifconfig wlan0 up ; ifconfig wlan0 down ; done As for the logs, MMCI controller sporadically reports either Command or Data CRC error, so likely the SDIO response (from WILC to Host) is corrupted.
On 11/7/24 5:10 PM, Marek Vasut wrote: > On 11/7/24 2:28 AM, Ajay.Kathat@microchip.com wrote: >> Hi Marek, > > Hello Ajay, > >> On 11/4/24 04:44, Marek Vasut wrote: >>> EXTERNAL EMAIL: Do not click links or open attachments unless you >>> know the >>> content is safe >>> >>> On 10/23/24 8:47 PM, Marek Vasut wrote: >>> >>> Hello again, >>> >>>>> Is power-save enabled during the test. With PS enabled, The SDIO >>>>> commands may >>>>> fail momentarily but it should recover. >>>> >>>> It seems it gets enabled after first ifconfig up, that's a good hint, >>>> I'll try to disable it and see if that makes them errors go away. >>>> Thanks! >>>> >>>> Do you have any details on WHY would such sporadic errors occur and how >>>> to make those go away even with PS enabled ? >>> Can you explain why does uAPSD (iw ...set power_save off) adversely >>> affect SDIO bus stability ? >>> >> >> SDIO bus errors can occur for different reasons and those errors can >> be of >> recoverable or non-recoverable type. For non-recoverable failures like >> firmware crashes, the retry mechanism may not help to resolve the >> issue. If >> the error is recoverable then driver should work with retry attempts. >> I think you are observing the bus errors messages and it is recovering >> after >> that. Is my understanding correct? > > I don't know. Is there any way to make the WILC firmware produce debug > output , so we can figure out what is going on "on the other side" ? > > Are you able to provide me (maybe off-list) some debug firmware build ? > (or can I get firmware sources and build and debug my own WILC firmware > on the Cortus CPU?) > >> With the previous shared test procedure, which makes the interface up/ >> down >> continuously, the station may not go into the Doze/Awake sequence >> since that >> mode switching gets activated after connection with AP. > > What does this mean ? I can trigger the SDIO errors even without being > connected to any AP , so this is something between the WILC and the SDIO > host, the radio is likely not involved , right ? > >>> Can you explain how to prevent that or shall we disable uAPSD >>> altogether ? >> >> Could you please share the test procedure and logs. I am occupied at the >> moment but I shall make some time to look into it and get a better >> understanding. > > The simplest test procedure is this: > > $ while true ; do ifconfig wlan0 up ; ifconfig wlan0 down ; done > > As for the logs, MMCI controller sporadically reports either Command or > Data CRC error, so likely the SDIO response (from WILC to Host) is > corrupted. Are there any news ?
Hi Marek, On 11/15/24 07:33, Marek Vasut wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the > content is safe > > On 11/7/24 5:10 PM, Marek Vasut wrote: >> On 11/7/24 2:28 AM, Ajay.Kathat@microchip.com wrote: >>> Hi Marek, >> >> Hello Ajay, >> >>> On 11/4/24 04:44, Marek Vasut wrote: >>>> EXTERNAL EMAIL: Do not click links or open attachments unless you >>>> know the >>>> content is safe >>>> >>>> On 10/23/24 8:47 PM, Marek Vasut wrote: >>>> >>>> Hello again, >>>> >>>>>> Is power-save enabled during the test. With PS enabled, The SDIO >>>>>> commands may >>>>>> fail momentarily but it should recover. >>>>> >>>>> It seems it gets enabled after first ifconfig up, that's a good hint, >>>>> I'll try to disable it and see if that makes them errors go away. >>>>> Thanks! >>>>> >>>>> Do you have any details on WHY would such sporadic errors occur and how >>>>> to make those go away even with PS enabled ? >>>> Can you explain why does uAPSD (iw ...set power_save off) adversely >>>> affect SDIO bus stability ? >>>> >>> >>> SDIO bus errors can occur for different reasons and those errors can >>> be of >>> recoverable or non-recoverable type. For non-recoverable failures like >>> firmware crashes, the retry mechanism may not help to resolve the >>> issue. If >>> the error is recoverable then driver should work with retry attempts. >>> I think you are observing the bus errors messages and it is recovering >>> after >>> that. Is my understanding correct? >> >> I don't know. Is there any way to make the WILC firmware produce debug >> output , so we can figure out what is going on "on the other side" ? >> >> Are you able to provide me (maybe off-list) some debug firmware build ? >> (or can I get firmware sources and build and debug my own WILC firmware >> on the Cortus CPU?) >> >>> With the previous shared test procedure, which makes the interface up/ >>> down >>> continuously, the station may not go into the Doze/Awake sequence >>> since that >>> mode switching gets activated after connection with AP. >> >> What does this mean ? I can trigger the SDIO errors even without being >> connected to any AP , so this is something between the WILC and the SDIO >> host, the radio is likely not involved , right ? >> >>>> Can you explain how to prevent that or shall we disable uAPSD >>>> altogether ? >>> >>> Could you please share the test procedure and logs. I am occupied at the >>> moment but I shall make some time to look into it and get a better >>> understanding. >> >> The simplest test procedure is this: >> >> $ while true ; do ifconfig wlan0 up ; ifconfig wlan0 down ; done >> >> As for the logs, MMCI controller sporadically reports either Command or >> Data CRC error, so likely the SDIO response (from WILC to Host) is >> corrupted. > > Are there any news ? I did test the same procedure in my setup, but I couldn't reproduce this issue even after running it for a long duration. In my test setup, I used the sama5d27-som1-ek1 host and wilc3000 firmware version 16.3. I think this issue could be related to the host MMCI controller driver. Normally, the wilc SDIO bus failures are captured by driver logs with an error code (e.g., timeout), but if the MMCI controller is outputting the warning message, then the error could be related to it. Does the MMCI controller error point to any specific function? Which host was used to test this scenario, and is it possible to test with different host or different configuration on the same host, like disabling power save on the host? Regards, Ajay Regards, Ajay
On 11/15/24 9:04 PM, Ajay.Kathat@microchip.com wrote: Hello Ajay, >>>>> Can you explain how to prevent that or shall we disable uAPSD >>>>> altogether ? >>>> >>>> Could you please share the test procedure and logs. I am occupied at the >>>> moment but I shall make some time to look into it and get a better >>>> understanding. >>> >>> The simplest test procedure is this: >>> >>> $ while true ; do ifconfig wlan0 up ; ifconfig wlan0 down ; done >>> >>> As for the logs, MMCI controller sporadically reports either Command or >>> Data CRC error, so likely the SDIO response (from WILC to Host) is >>> corrupted. >> >> Are there any news ? > > I did test the same procedure in my setup, but I couldn't reproduce this issue > even after running it for a long duration. In my test setup, I used the > sama5d27-som1-ek1 host and wilc3000 firmware version 16.3. > > I think this issue could be related to the host MMCI controller driver. > Normally, the wilc SDIO bus failures are captured by driver logs with an error > code (e.g., timeout), but if the MMCI controller is outputting the warning > message, then the error could be related to it. Does the MMCI controller error > point to any specific function? Either CMD52 or CMD53 errors out with CRC error, this is recognized by the controller. That points to sporadic CRC error during SDIO transfer. > Which host was used to test this scenario, and > is it possible to test with different host or different configuration on the > same host I am observing sporadic command and data CRC errors on STM32MP157F system with SDIO WILC3000. , like disabling power save on the host? I already tested disabling power save. Can you explain why does uAPSD (iw ...set power_save off) adversely affect SDIO bus stability ? Can you explain how to prevent that or shall we disable uAPSD altogether ? Is there any way to make the WILC firmware produce debug output , so we can figure out what is going on "on the other side" ? Are you able to provide me (maybe off-list) some debug firmware build ? (or can I get firmware sources and build and debug my own WILC firmware on the Cortus CPU?) I can trigger the SDIO errors even without being connected to any AP , so this is something between the WILC and the SDIO host, the radio is likely not involved , right ?
On 11/16/24 8:57 PM, Marek Vasut wrote: > On 11/15/24 9:04 PM, Ajay.Kathat@microchip.com wrote: > > Hello Ajay, > >>>>>> Can you explain how to prevent that or shall we disable uAPSD >>>>>> altogether ? >>>>> >>>>> Could you please share the test procedure and logs. I am occupied >>>>> at the >>>>> moment but I shall make some time to look into it and get a better >>>>> understanding. >>>> >>>> The simplest test procedure is this: >>>> >>>> $ while true ; do ifconfig wlan0 up ; ifconfig wlan0 down ; done >>>> >>>> As for the logs, MMCI controller sporadically reports either Command or >>>> Data CRC error, so likely the SDIO response (from WILC to Host) is >>>> corrupted. >>> >>> Are there any news ? >> >> I did test the same procedure in my setup, but I couldn't reproduce >> this issue >> even after running it for a long duration. In my test setup, I used the >> sama5d27-som1-ek1 host and wilc3000 firmware version 16.3. >> >> I think this issue could be related to the host MMCI controller driver. >> Normally, the wilc SDIO bus failures are captured by driver logs with >> an error >> code (e.g., timeout), but if the MMCI controller is outputting the >> warning >> message, then the error could be related to it. Does the MMCI >> controller error >> point to any specific function? > > Either CMD52 or CMD53 errors out with CRC error, this is recognized by > the controller. That points to sporadic CRC error during SDIO transfer. > >> Which host was used to test this scenario, and >> is it possible to test with different host or different configuration >> on the >> same host > > I am observing sporadic command and data CRC errors on STM32MP157F > system with SDIO WILC3000. > > , like disabling power save on the host? > I already tested disabling power save. > > Can you explain why does uAPSD (iw ...set power_save off) adversely > affect SDIO bus stability ? > > Can you explain how to prevent that or shall we disable uAPSD altogether ? > > Is there any way to make the WILC firmware produce debug output , so we > can figure out what is going on "on the other side" ? > > Are you able to provide me (maybe off-list) some debug firmware build ? > (or can I get firmware sources and build and debug my own WILC firmware > on the Cortus CPU?) > > I can trigger the SDIO errors even without being connected to any AP , > so this is something between the WILC and the SDIO host, the radio is > likely not involved , right ? Any news ?
diff --git a/drivers/net/wireless/microchip/wilc1000/cfg80211.c b/drivers/net/wireless/microchip/wilc1000/cfg80211.c index b0dae6f7c633b..9a9fc8e8c8354 100644 --- a/drivers/net/wireless/microchip/wilc1000/cfg80211.c +++ b/drivers/net/wireless/microchip/wilc1000/cfg80211.c @@ -1730,7 +1730,6 @@ static const struct cfg80211_ops wilc_cfg80211_ops = { static void wlan_init_locks(struct wilc *wl) { - mutex_init(&wl->hif_cs); mutex_init(&wl->rxq_cs); mutex_init(&wl->cfg_cmd_lock); mutex_init(&wl->vif_mutex); @@ -1748,7 +1747,6 @@ static void wlan_init_locks(struct wilc *wl) void wlan_deinit_locks(struct wilc *wilc) { - mutex_destroy(&wilc->hif_cs); mutex_destroy(&wilc->rxq_cs); mutex_destroy(&wilc->cfg_cmd_lock); mutex_destroy(&wilc->txq_add_to_head_cs); diff --git a/drivers/net/wireless/microchip/wilc1000/netdev.c b/drivers/net/wireless/microchip/wilc1000/netdev.c index 7e84fc0fd9118..22c91a0b9b648 100644 --- a/drivers/net/wireless/microchip/wilc1000/netdev.c +++ b/drivers/net/wireless/microchip/wilc1000/netdev.c @@ -468,9 +468,7 @@ static void wilc_wlan_deinitialize(struct net_device *dev) if (!wl->dev_irq_num && wl->hif_func->disable_interrupt) { - mutex_lock(&wl->hif_cs); wl->hif_func->disable_interrupt(wl); - mutex_unlock(&wl->hif_cs); } complete(&wl->txq_event); diff --git a/drivers/net/wireless/microchip/wilc1000/netdev.h b/drivers/net/wireless/microchip/wilc1000/netdev.h index 95bc8b8fe65a5..8bdc27edf00af 100644 --- a/drivers/net/wireless/microchip/wilc1000/netdev.h +++ b/drivers/net/wireless/microchip/wilc1000/netdev.h @@ -240,9 +240,6 @@ struct wilc { /* protect rxq_entry_t receiver queue */ struct mutex rxq_cs; - /* lock to protect hif access */ - struct mutex hif_cs; - struct completion cfg_event; struct completion sync_event; struct completion txq_event; diff --git a/drivers/net/wireless/microchip/wilc1000/sdio.c b/drivers/net/wireless/microchip/wilc1000/sdio.c index 170470d1c2092..0796e2ccfe32e 100644 --- a/drivers/net/wireless/microchip/wilc1000/sdio.c +++ b/drivers/net/wireless/microchip/wilc1000/sdio.c @@ -56,11 +56,23 @@ struct sdio_cmd53 { static const struct wilc_hif_func wilc_hif_sdio; -static void wilc_sdio_interrupt(struct sdio_func *func) +static void wilc_sdio_claim(struct wilc *wilc) +{ + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); + + sdio_claim_host(func); +} + +static void wilc_sdio_release(struct wilc *wilc) { + struct sdio_func *func = container_of(wilc->dev, struct sdio_func, dev); + sdio_release_host(func); +} + +static void wilc_sdio_interrupt(struct sdio_func *func) +{ wilc_handle_isr(sdio_get_drvdata(func)); - sdio_claim_host(func); } static int wilc_sdio_cmd52(struct wilc *wilc, struct sdio_cmd52 *cmd) @@ -69,8 +81,6 @@ static int wilc_sdio_cmd52(struct wilc *wilc, struct sdio_cmd52 *cmd) int ret; u8 data; - sdio_claim_host(func); - func->num = cmd->function; if (cmd->read_write) { /* write */ if (cmd->raw) { @@ -85,8 +95,6 @@ static int wilc_sdio_cmd52(struct wilc *wilc, struct sdio_cmd52 *cmd) cmd->data = data; } - sdio_release_host(func); - if (ret) dev_err(&func->dev, "%s..failed, err(%d)\n", __func__, ret); return ret; @@ -99,8 +107,6 @@ static int wilc_sdio_cmd53(struct wilc *wilc, struct sdio_cmd53 *cmd) struct wilc_sdio *sdio_priv = wilc->bus_data; u8 *buf = cmd->buffer; - sdio_claim_host(func); - func->num = cmd->function; func->cur_blksize = cmd->block_size; if (cmd->block_mode) @@ -128,8 +134,6 @@ static int wilc_sdio_cmd53(struct wilc *wilc, struct sdio_cmd53 *cmd) memcpy(cmd->buffer, buf, size); } out: - sdio_release_host(func); - if (ret) dev_err(&func->dev, "%s..failed, err(%d)\n", __func__, ret); @@ -180,9 +184,11 @@ static int wilc_sdio_probe(struct sdio_func *func, goto dispose_irq; } + wilc_sdio_claim(wilc); wilc_sdio_init(wilc, false); ret = wilc_get_chipid(wilc); + wilc_sdio_release(wilc); if (ret) goto dispose_irq; @@ -196,7 +202,9 @@ static int wilc_sdio_probe(struct sdio_func *func, goto dispose_irq; } + wilc_sdio_claim(wilc); wilc_sdio_deinit(wilc); + wilc_sdio_release(wilc); vif = wilc_netdev_ifc_init(wilc, "wlan%d", WILC_STATION_MODE, NL80211_IFTYPE_STATION, false); @@ -258,9 +266,9 @@ static int wilc_sdio_enable_interrupt(struct wilc *dev) struct sdio_func *func = container_of(dev->dev, struct sdio_func, dev); int ret = 0; - sdio_claim_host(func); + wilc_sdio_claim(dev); ret = sdio_claim_irq(func, wilc_sdio_interrupt); - sdio_release_host(func); + wilc_sdio_release(dev); if (ret < 0) { dev_err(&func->dev, "can't claim sdio_irq, err(%d)\n", ret); @@ -274,11 +282,11 @@ static void wilc_sdio_disable_interrupt(struct wilc *dev) struct sdio_func *func = container_of(dev->dev, struct sdio_func, dev); int ret; - sdio_claim_host(func); + wilc_sdio_claim(dev); ret = sdio_release_irq(func); if (ret < 0) dev_err(&func->dev, "can't release sdio_irq, err(%d)\n", ret); - sdio_release_host(func); + wilc_sdio_release(dev); } /******************************************** @@ -1013,6 +1021,8 @@ static const struct wilc_hif_func wilc_hif_sdio = { .disable_interrupt = wilc_sdio_disable_interrupt, .hif_reset = wilc_sdio_reset, .hif_is_init = wilc_sdio_is_init, + .hif_claim = wilc_sdio_claim, + .hif_release = wilc_sdio_release, }; static int wilc_sdio_suspend(struct device *dev) @@ -1053,7 +1063,9 @@ static int wilc_sdio_resume(struct device *dev) if (!IS_ERR(wilc->rtc_clk)) clk_prepare_enable(wilc->rtc_clk); + wilc_sdio_claim(wilc); wilc_sdio_init(wilc, true); + wilc_sdio_release(wilc); wilc_sdio_enable_interrupt(wilc); return host_wakeup_notify(wilc); diff --git a/drivers/net/wireless/microchip/wilc1000/spi.c b/drivers/net/wireless/microchip/wilc1000/spi.c index ce2a9cdd6aa78..7af25ea6068f4 100644 --- a/drivers/net/wireless/microchip/wilc1000/spi.c +++ b/drivers/net/wireless/microchip/wilc1000/spi.c @@ -50,6 +50,9 @@ struct wilc_spi { struct gpio_desc *enable; /* ENABLE GPIO or NULL */ struct gpio_desc *reset; /* RESET GPIO or NULL */ } gpios; + + /* lock to protect hif access */ + struct mutex hif_cs; }; static const struct wilc_hif_func wilc_hif_spi; @@ -1105,6 +1108,15 @@ static int wilc_spi_write(struct wilc *wilc, u32 addr, u8 *buf, u32 size) * Bus interfaces * ********************************************/ +static void wilc_spi_claim(struct wilc *wilc) +{ + mutex_lock(&wilc->hif_cs); +} + +static void wilc_spi_release(struct wilc *wilc) +{ + mutex_unlock(&wilc->hif_cs); +} static int wilc_spi_reset(struct wilc *wilc) { @@ -1132,6 +1144,7 @@ static int wilc_spi_deinit(struct wilc *wilc) spi_priv->isinit = false; wilc_wlan_power(wilc, false); + mutex_destroy(&wilc->hif_cs); return 0; } @@ -1140,6 +1153,8 @@ static int wilc_spi_init(struct wilc *wilc, bool resume) struct wilc_spi *spi_priv = wilc->bus_data; int ret; + mutex_init(&spi->hif_cs); + if (spi_priv->isinit) { /* Confirm we can read chipid register without error: */ if (wilc_validate_chipid(wilc) == 0) diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.c b/drivers/net/wireless/microchip/wilc1000/wlan.c index 9d80adc45d6be..b149734f19a05 100644 --- a/drivers/net/wireless/microchip/wilc1000/wlan.c +++ b/drivers/net/wireless/microchip/wilc1000/wlan.c @@ -767,25 +767,37 @@ static int chip_wakeup(struct wilc *wilc) static inline int acquire_bus(struct wilc *wilc, enum bus_acquire acquire) { - int ret = 0; + const struct wilc_hif_func *hif_func = wilc->hif_func; + int ret; - mutex_lock(&wilc->hif_cs); - if (acquire == WILC_BUS_ACQUIRE_AND_WAKEUP && wilc->power_save_mode) { - ret = chip_wakeup(wilc); - if (ret) - mutex_unlock(&wilc->hif_cs); - } + hif_func->hif_claim(wilc); + + if (!wilc->power_save_mode) + return 0; + if (acquire != WILC_BUS_ACQUIRE_AND_WAKEUP) + return 0; + + ret = chip_wakeup(wilc); + if (ret) + goto err; + + return 0; + +err: + hif_func->hif_release(wilc); return ret; } static inline int release_bus(struct wilc *wilc, enum bus_release release) { + const struct wilc_hif_func *hif_func = wilc->hif_func; int ret = 0; if (release == WILC_BUS_RELEASE_ALLOW_SLEEP && wilc->power_save_mode) ret = chip_allow_sleep(wilc); - mutex_unlock(&wilc->hif_cs); + + hif_func->hif_release(wilc); return ret; } @@ -1447,7 +1459,9 @@ void wilc_wlan_cleanup(struct net_device *dev) wilc->rx_buffer = NULL; kfree(wilc->tx_buffer); wilc->tx_buffer = NULL; + acquire_bus(wilc, WILC_BUS_ACQUIRE_AND_WAKEUP); wilc->hif_func->hif_deinit(wilc); + release_bus(wilc, WILC_BUS_RELEASE_ALLOW_SLEEP); } static int wilc_wlan_cfg_commit(struct wilc_vif *vif, int type, diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.h b/drivers/net/wireless/microchip/wilc1000/wlan.h index b9e7f9222eadd..ade2db95e8a0f 100644 --- a/drivers/net/wireless/microchip/wilc1000/wlan.h +++ b/drivers/net/wireless/microchip/wilc1000/wlan.h @@ -403,6 +403,8 @@ struct wilc_hif_func { void (*disable_interrupt)(struct wilc *nic); int (*hif_reset)(struct wilc *wilc); bool (*hif_is_init)(struct wilc *wilc); + void (*hif_claim)(struct wilc *wilc); + void (*hif_release)(struct wilc *wilc); }; #define WILC_MAX_CFG_FRAME_SIZE 1468
The bus locking in this driver is broken and produces subtle race condition with ksdioirqd and its mmc_claim_host()/mmc_release_host() usage in case of SDIO bus. Rework the locking to avoid this race condition. The problem is the hif_cs mutex used in acquire_bus()/release_bus(), which makes it look like calling acquire_bus() results in exclusive access to the bus, but that is not true for SDIO bus. For SDIO bus, to obtain exclusive access (any access, really), it is necessary to call sdio_claim_host(), which is a wrapper around mmc_claim_host(), which does its own locking. The acquire_bus() does not do that, but the SDIO interface implementation does call sdio_claim_host() and sdio_release_host() every single command, which is problematic. To make things worse, wilc_sdio_interrupt() implementation called from ksdioirqd first calls sdio_release_host(), then interrupt handling and finally sdio_claim_host(). The core problem is that sdio_claim_host() cannot be done per command, but has to be done per register/data IO which consists of multiple commands. Usually the WILC register read/write consists of 3x CMD52 to push in CSA pointer address and 1x CMD53 to read/write data to that address. Most other accesses are also composed of multiple commands. Currently, if ksdioirqd wakes up and attempts to read SDIO_CCCR_INTx to get pending SDIO IRQs in sdio_get_pending_irqs(), it can easily perform that transfer between two consecutive CMD52 which are pushing in the CSA pointer address and possibly disrupt the WILC operation. This is undesired behavior. Rework the locking. Introduce new .hif_claim/.hif_release callbacks which implement bus specific locking. Lock/unlock SDIO bus access using sdio_claim_host() and sdio_release_host(), lock/unlock SPI bus access using the current hif_cs mutex moved purely into the spi.c interface. Make acquire_bus() and release_bus() call the .hif_claim/.hif_release() callbacks and do not access the hif_cs mutex from there at all. Remove any SDIO bus locking used directly in commands and the broken SDIO bus unlocking in wilc_sdio_interrupt(), this is no longer needed. Fix up SDIO initialization code which newly needs sdio_claim_host() and sdio_release_host(), since it cannot depend on the locking being done per-command anymore. Signed-off-by: Marek Vasut <marex@denx.de> --- Cc: "David S. Miller" <davem@davemloft.net> Cc: Adham Abozaeid <adham.abozaeid@microchip.com> Cc: Ajay Singh <ajay.kathat@microchip.com> Cc: Alexis Lothoré <alexis.lothore@bootlin.com> Cc: Claudiu Beznea <claudiu.beznea@tuxon.dev> Cc: Conor Dooley <conor+dt@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Kalle Valo <kvalo@kernel.org> Cc: Krzysztof Kozlowski <krzk+dt@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Rob Herring <robh@kernel.org> Cc: devicetree@vger.kernel.org Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org --- NOTE: I only tested the SDIO part --- .../wireless/microchip/wilc1000/cfg80211.c | 2 - .../net/wireless/microchip/wilc1000/netdev.c | 2 - .../net/wireless/microchip/wilc1000/netdev.h | 3 -- .../net/wireless/microchip/wilc1000/sdio.c | 40 ++++++++++++------- drivers/net/wireless/microchip/wilc1000/spi.c | 15 +++++++ .../net/wireless/microchip/wilc1000/wlan.c | 30 ++++++++++---- .../net/wireless/microchip/wilc1000/wlan.h | 2 + 7 files changed, 65 insertions(+), 29 deletions(-)