Message ID | YlogluONIoc1VTCI@bombadil.infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] blktests: replace module removal with patient module removal | expand |
On 4/15/22 18:49, Luis Chamberlain wrote: > - if ! modprobe -r null_blk || ! modprobe null_blk "$@" "${zoned}" ; then > - return 1 > - fi > + _patient_rmmod null_blk || return 1 > + modprobe null_blk "$@" "${zoned}" || 1 "1" is not a valid command. Should "|| 1" perhaps be changed into "|| return 1"? > +_has_modprobe_patient() > +{ > + modprobe --help >& /dev/null || return 1 > + modprobe --help | grep -q -1 "remove-patiently" || return 1 > + return 0 > +} I can't find the meaning of "-1" in the grep man page. Did I perhaps overlook something? > +# checks the refcount and returns 0 if we can safely remove the module. rmmod > +# does this check for us, but we can use this to also iterate checking for this > +# refcount before we even try to remove the module. This is useful when using > +# debug test modules which take a while to quiesce. > +_patient_rmmod_check_refcnt() > +{ > + local module=$1 > + local refcnt=0 > + > + if [[ -f "/sys/module/$module/refcnt" ]]; then > + refcnt=$(cat "/sys/module/$module/refcnt" 2>/dev/null) > + if [[ $? -ne 0 || $refcnt -eq 0 ]]; then > + return 0 > + fi > + return 1 > + fi > + return 0 > +} Hmm ... why is the check for existence of the refcnt separate from reading the refcnt? I think that just reading the refcnt should be sufficient. Additionally, that will avoid the race where the module is unloaded after the check and before the refcnt is read. > - modprobe -r nvme-"${nvme_trtype}" 2>/dev/null > - if [[ "${nvme_trtype}" != "loop" ]]; then > - modprobe -r nvmet-"${nvme_trtype}" 2>/dev/null > - fi > - modprobe -r nvmet 2>/dev/null > + if [[ "${nvme_trtype}" == "loop" ]]; then > + _patient_rmmod nvme_"${nvme_trtype}" > + else > + _patient_rmmod nvme-"${nvme_trtype}" > + _patient_rmmod nvmet-"${nvme_trtype}" > + fi > + _patient_rmmod nvmet 2>/dev/null The statement _patient_rmmod nvme-"${nvme_trtype}" occurs twice in the above code. How about preserving the structure of the existing code such that that statement only occurs once? > # Unload the SRP initiator driver. > stop_srp_ini() { > - local i > - > log_out > - for ((i=40;i>=0;i--)); do > - remove_mpath_devs || return $? > - unload_module ib_srp >/dev/null 2>&1 && break > - sleep 1 > - done > - if [ -e /sys/module/ib_srp ]; then > - echo "Error: unloading kernel module ib_srp failed" > - return 1 > - fi > - unload_module scsi_transport_srp || return $? > + remove_mpath_devs || return $? > + _patient_rmmod ib_srp || return 1 > + _patient_rmmod scsi_transport_srp || return $? > } Removing the loop from around remove_mpath_devs is wrong. It is important that that loop is preserved. Thanks, Bart.
On Fri, Apr 15, 2022 at 09:01:03PM -0700, Bart Van Assche wrote: > On 4/15/22 18:49, Luis Chamberlain wrote: > > - if ! modprobe -r null_blk || ! modprobe null_blk "$@" "${zoned}" ; then > > - return 1 > > - fi > > + _patient_rmmod null_blk || return 1 > > + modprobe null_blk "$@" "${zoned}" || 1 > > "1" is not a valid command. Should "|| 1" perhaps be changed into "|| return > 1"? That was a typo, fixed. > > +_has_modprobe_patient() > > +{ > > + modprobe --help >& /dev/null || return 1 > > + modprobe --help | grep -q -1 "remove-patiently" || return 1 > > + return 0 > > +} > > I can't find the meaning of "-1" in the grep man page. Did I perhaps > overlook something? That's shorthand for -C 1, but we can remove it as it is not needed. > > +# checks the refcount and returns 0 if we can safely remove the module. rmmod > > +# does this check for us, but we can use this to also iterate checking for this > > +# refcount before we even try to remove the module. This is useful when using > > +# debug test modules which take a while to quiesce. > > +_patient_rmmod_check_refcnt() > > +{ > > + local module=$1 > > + local refcnt=0 > > + > > + if [[ -f "/sys/module/$module/refcnt" ]]; then > > + refcnt=$(cat "/sys/module/$module/refcnt" 2>/dev/null) > > + if [[ $? -ne 0 || $refcnt -eq 0 ]]; then > > + return 0 > > + fi > > + return 1 > > + fi > > + return 0 > > +} > > Hmm ... why is the check for existence of the refcnt separate from reading > the refcnt? I think that just reading the refcnt should be sufficient. > Additionally, that will avoid the race where the module is unloaded after > the check and before the refcnt is read. We can certainly simplify it as follows: _patient_rmmod_check_refcnt() { local module=$1 local refcnt=0 refcnt=$(cat "/sys/module/$module/refcnt" 2>/dev/null) if [[ $? -ne 0 || $refcnt -eq 0 ]]; then return 0 fi return 1 } > > - modprobe -r nvme-"${nvme_trtype}" 2>/dev/null > > - if [[ "${nvme_trtype}" != "loop" ]]; then > > - modprobe -r nvmet-"${nvme_trtype}" 2>/dev/null > > - fi > > - modprobe -r nvmet 2>/dev/null > > + if [[ "${nvme_trtype}" == "loop" ]]; then > > + _patient_rmmod nvme_"${nvme_trtype}" > > + else > > + _patient_rmmod nvme-"${nvme_trtype}" > > + _patient_rmmod nvmet-"${nvme_trtype}" > > + fi > > + _patient_rmmod nvmet 2>/dev/null > > The statement _patient_rmmod nvme-"${nvme_trtype}" occurs twice in the above > code. How about preserving the structure of the existing code such that that > statement only occurs once? There is one call for nvme-"${nvme_trtype}", the other is for the underscore version, so there are no two calls. Did I miss something? > > # Unload the SRP initiator driver. > > stop_srp_ini() { > > - local i > > - > > log_out > > - for ((i=40;i>=0;i--)); do > > - remove_mpath_devs || return $? > > - unload_module ib_srp >/dev/null 2>&1 && break > > - sleep 1 > > - done > > - if [ -e /sys/module/ib_srp ]; then > > - echo "Error: unloading kernel module ib_srp failed" > > - return 1 > > - fi > > - unload_module scsi_transport_srp || return $? > > + remove_mpath_devs || return $? > > + _patient_rmmod ib_srp || return 1 > > + _patient_rmmod scsi_transport_srp || return $? > > } > > Removing the loop from around remove_mpath_devs is wrong. It is important > that that loop is preserved. Why ? Can you test and verify? I can explain why I'm removing it. Code which typically used to try to insist on a module removal were just running into situations where the refcnt got bumped but they could not explain why, and the reason is that module refcnt's are finicky. *Anything* in userspace can easily trigger this, and this is module specific. While doing a module removal, if you are running into this, the only thing you can do is to patiently wait until userspace is done with whatever it may try to do. The old re-try attempts are buggy because of this. A future mechansim for kmod will be to allow userspace to try to remove first the modules which hold a refcnt, but that does not do anything for whatever userspace might do. These entry points from userspace are completely module specific and cannot be generalized (nor do I think we want to add semantics to the kernelf or it). So the only thing one can sensibly do, specially in test frameworks, is to wait and use a timeout for it. Luis
On 4/18/22 08:57, Luis Chamberlain wrote: > On Fri, Apr 15, 2022 at 09:01:03PM -0700, Bart Van Assche wrote: >> On 4/15/22 18:49, Luis Chamberlain wrote: >>> - modprobe -r nvme-"${nvme_trtype}" 2>/dev/null >>> - if [[ "${nvme_trtype}" != "loop" ]]; then >>> - modprobe -r nvmet-"${nvme_trtype}" 2>/dev/null >>> - fi >>> - modprobe -r nvmet 2>/dev/null >>> + if [[ "${nvme_trtype}" == "loop" ]]; then >>> + _patient_rmmod nvme_"${nvme_trtype}" >>> + else >>> + _patient_rmmod nvme-"${nvme_trtype}" >>> + _patient_rmmod nvmet-"${nvme_trtype}" >>> + fi >>> + _patient_rmmod nvmet 2>/dev/null >> >> The statement _patient_rmmod nvme-"${nvme_trtype}" occurs twice in the above >> code. How about preserving the structure of the existing code such that that >> statement only occurs once? > > There is one call for nvme-"${nvme_trtype}", the other is for the > underscore version, so there are no two calls. > > Did I miss something? It's only now that I see the underscore/hyphen difference in the if and else branches. It is not clear to me why this behavior change has been introduced? The following command produces no output on my test setup: find /lib/modules -name 'nvme_*' >>> # Unload the SRP initiator driver. >>> stop_srp_ini() { >>> - local i >>> - >>> log_out >>> - for ((i=40;i>=0;i--)); do >>> - remove_mpath_devs || return $? >>> - unload_module ib_srp >/dev/null 2>&1 && break >>> - sleep 1 >>> - done >>> - if [ -e /sys/module/ib_srp ]; then >>> - echo "Error: unloading kernel module ib_srp failed" >>> - return 1 >>> - fi >>> - unload_module scsi_transport_srp || return $? >>> + remove_mpath_devs || return $? >>> + _patient_rmmod ib_srp || return 1 >>> + _patient_rmmod scsi_transport_srp || return $? >>> } >> >> Removing the loop from around remove_mpath_devs is wrong. It is important >> that that loop is preserved. > > Why ? Can you test and verify? If I remember correctly I put remove_mpath_devs call inside the loop because multipathd keeps running while the loop is ongoing and hence can modify paths while the loop is running. Thanks, Bart.
On Mon, Apr 18, 2022 at 09:39:06AM -0700, Bart Van Assche wrote: > On 4/18/22 08:57, Luis Chamberlain wrote: > > On Fri, Apr 15, 2022 at 09:01:03PM -0700, Bart Van Assche wrote: > > > On 4/15/22 18:49, Luis Chamberlain wrote: > > > > - modprobe -r nvme-"${nvme_trtype}" 2>/dev/null > > > > - if [[ "${nvme_trtype}" != "loop" ]]; then > > > > - modprobe -r nvmet-"${nvme_trtype}" 2>/dev/null > > > > - fi > > > > - modprobe -r nvmet 2>/dev/null > > > > + if [[ "${nvme_trtype}" == "loop" ]]; then > > > > + _patient_rmmod nvme_"${nvme_trtype}" > > > > + else > > > > + _patient_rmmod nvme-"${nvme_trtype}" > > > > + _patient_rmmod nvmet-"${nvme_trtype}" > > > > + fi > > > > + _patient_rmmod nvmet 2>/dev/null > > > > > > The statement _patient_rmmod nvme-"${nvme_trtype}" occurs twice in the above > > > code. How about preserving the structure of the existing code such that that > > > statement only occurs once? > > > > There is one call for nvme-"${nvme_trtype}", the other is for the > > underscore version, so there are no two calls. > > > > Did I miss something? > > It's only now that I see the underscore/hyphen difference in the if and else > branches. It is not clear to me why this behavior change has been introduced? > The following command produces no output on my test setup: > > find /lib/modules -name 'nvme_*' Try: find /sys/module/ -maxdepth 1 -type d | grep loop I get: /sys/module/nvme_loop Blame scripts/Makefile.lib which does the s|-|_| for KBUILD_MODNAME: name-fix-token = $(subst $(comma),_,$(subst -,_,$1)) name-fix = $(call stringify,$(call name-fix-token,$1)) basename_flags = -DKBUILD_BASENAME=$(call name-fix,$(basetarget)) modname_flags = -DKBUILD_MODNAME=$(call name-fix,$(modname)) \ -D__KBUILD_MODNAME=kmod_$(call name-fix-token,$(modname)) I'm thinking now it is best to do this hack here: diff --git a/common/rc b/common/rc index 67bba70..cff0eb2 100644 --- a/common/rc +++ b/common/rc @@ -390,7 +390,10 @@ _patient_rmmod_check_refcnt() # ourselves. _patient_rmmod() { - local module=$1 + # Since we are looking for a directory we must adopt the + # specific directory used by scripts/Makefile.lib for + # KBUILD_MODNAME + local module=$(echo $1 | sed 's|-|_|g') local max_tries_max=$MODPROBE_PATIENT_RM_TIMEOUT_SECONDS local max_tries=0 local mod_ret=0 That would allow us to keep the old order, and in fact would be correct for the other targets as well. > > > > # Unload the SRP initiator driver. > > > > stop_srp_ini() { > > > > - local i > > > > - > > > > log_out > > > > - for ((i=40;i>=0;i--)); do > > > > - remove_mpath_devs || return $? > > > > - unload_module ib_srp >/dev/null 2>&1 && break > > > > - sleep 1 > > > > - done > > > > - if [ -e /sys/module/ib_srp ]; then > > > > - echo "Error: unloading kernel module ib_srp failed" > > > > - return 1 > > > > - fi > > > > - unload_module scsi_transport_srp || return $? > > > > + remove_mpath_devs || return $? > > > > + _patient_rmmod ib_srp || return 1 > > > > + _patient_rmmod scsi_transport_srp || return $? > > > > } > > > > > > Removing the loop from around remove_mpath_devs is wrong. It is important > > > that that loop is preserved. > > > > Why ? Can you test and verify? > > If I remember correctly I put remove_mpath_devs call inside the loop because > multipathd keeps running while the loop is ongoing and hence can modify paths > while the loop is running. I suspect you ran into the issue of the refcnt being bumped by anything multipathd tried, and not being able to remove the module, but if it is adding *new* mpath devices that would seem like a bug which we'd need to address. The point to the patient module removal is to keep on trying until the recnt is 0 and if that fails wait and keep trying, until the the timeout. Anytihng userspace does, say multipathd does, like bdev_open(), would be yielded to. So I'd like to keep this change as it is exactly the sort of hack I am chasing after with this crusade. Let me know how you'd like to proceed so I can punt a v3. Luis
On 5/11/22 10:44, Luis Chamberlain wrote: > I suspect you ran into the issue of the refcnt being bumped by anything > multipathd tried, and not being able to remove the module, but if it is > adding *new* mpath devices that would seem like a bug which we'd need to > address. The point to the patient module removal is to keep on trying > until the recnt is 0 and if that fails wait and keep trying, until the > the timeout. Anytihng userspace does, say multipathd does, like > bdev_open(), would be yielded to. > > So I'd like to keep this change as it is exactly the sort of hack I am > chasing after with this crusade. > > Let me know how you'd like to proceed so I can punt a v3. Please implement the patient module removal and the stop_srp_ini() behavior changes as separate patches such that the stop_srp_ini() behavioral changes can be reverted easily in case these would trigger a regression. Thanks, Bart.
On Wed, May 11, 2022 at 11:22:52AM -0700, Bart Van Assche wrote: > On 5/11/22 10:44, Luis Chamberlain wrote: > > I suspect you ran into the issue of the refcnt being bumped by anything > > multipathd tried, and not being able to remove the module, but if it is > > adding *new* mpath devices that would seem like a bug which we'd need to > > address. The point to the patient module removal is to keep on trying > > until the recnt is 0 and if that fails wait and keep trying, until the > > the timeout. Anytihng userspace does, say multipathd does, like > > bdev_open(), would be yielded to. > > > > So I'd like to keep this change as it is exactly the sort of hack I am > > chasing after with this crusade. > > > > Let me know how you'd like to proceed so I can punt a v3. > > Please implement the patient module removal and the stop_srp_ini() behavior > changes as separate patches such that the stop_srp_ini() behavioral changes > can be reverted easily in case these would trigger a regression. I had dropped the ball here, because well, it takes a bit of time to re-test everything. And also just around this time I was also ironing out how to properly automate testing for srp. That's all done now and I finally had time to re-test so will send a v3 out shortly. Luis
diff --git a/common/multipath-over-rdma b/common/multipath-over-rdma index cef05ec92d29..a7821c7469f5 100644 --- a/common/multipath-over-rdma +++ b/common/multipath-over-rdma @@ -4,6 +4,7 @@ # # Functions and global variables used by both the srp and nvmeof-mp tests. +. common/rc . common/shellcheck . common/null_blk @@ -427,14 +428,8 @@ stop_soft_rdma() { echo "$i ..." rdma link del "${i}" || echo "Failed to remove ${i}" done - if ! unload_module rdma_rxe 10; then - echo "Unloading rdma_rxe failed" - return 1 - fi - if ! unload_module siw 10; then - echo "Unloading siw failed" - return 1 - fi + _patient_rmmod rdma_rxe || return 1 + _patient_rmmod siw || return 1 } >>"$FULL" } diff --git a/common/null_blk b/common/null_blk index 6611db03a00e..6c816579c0b8 100644 --- a/common/null_blk +++ b/common/null_blk @@ -4,6 +4,8 @@ # # null_blk helper functions. +. common/rc + _have_null_blk() { _have_modules null_blk } @@ -21,9 +23,8 @@ _init_null_blk() { local zoned="" if (( RUN_FOR_ZONED )); then zoned="zoned=1"; fi - if ! modprobe -r null_blk || ! modprobe null_blk "$@" "${zoned}" ; then - return 1 - fi + _patient_rmmod null_blk || return 1 + modprobe null_blk "$@" "${zoned}" || 1 udevadm settle return 0 @@ -46,5 +47,5 @@ _configure_null_blk() { _exit_null_blk() { _remove_null_blk_devices udevadm settle - modprobe -r null_blk + _patient_rmmod null_blk } diff --git a/common/rc b/common/rc index 0528ce808256..67bba70eb019 100644 --- a/common/rc +++ b/common/rc @@ -321,15 +321,148 @@ _uptime_s() { awk '{ print int($1) }' /proc/uptime } -# Arguments: module to unload ($1) and retry count ($2). -unload_module() { - local i m=$1 rc=${2:-1} - - [ ! -e "/sys/module/$m" ] && return 0 - for ((i=rc;i>0;i--)); do - modprobe -r "$m" - [ ! -e "/sys/module/$m" ] && return 0 - sleep .1 +_has_modprobe_patient() +{ + modprobe --help >& /dev/null || return 1 + modprobe --help | grep -q -1 "remove-patiently" || return 1 + return 0 +} + +# Set MODPROBE_PATIENT_RM_TIMEOUT_SECONDS to "forever" if you want the patient +# modprobe removal to run forever trying to remove a module. +MODPROBE_REMOVE_PATIENT="" +if ! _has_modprobe_patient; then + if [[ -z "$MODPROBE_PATIENT_RM_TIMEOUT_SECONDS" ]]; then + # We will open code our own implementation of patient module + # remover in blktests. Use a 50 second default. + export MODPROBE_PATIENT_RM_TIMEOUT_SECONDS="50" + fi +else + MODPROBE_RM_PATIENT_TIMEOUT_ARGS="" + if [[ -n "$MODPROBE_PATIENT_RM_TIMEOUT_SECONDS" ]]; then + if [[ "$MODPROBE_PATIENT_RM_TIMEOUT_SECONDS" != "forever" ]]; then + MODPROBE_PATIENT_RM_TIMEOUT_MS="$((MODPROBE_PATIENT_RM_TIMEOUT_SECONDS * 1000))" + MODPROBE_RM_PATIENT_TIMEOUT_ARGS="-t $MODPROBE_PATIENT_RM_TIMEOUT_MS" + fi + else + # We export MODPROBE_PATIENT_RM_TIMEOUT_SECONDS here for parity + # with environments without support for modprobe -p, but we + # only really need it exported right now for environments which + # don't have support for modprobe -p to implement our own + # patient module removal support within fstests. + export MODPROBE_PATIENT_RM_TIMEOUT_SECONDS="50" + MODPROBE_PATIENT_RM_TIMEOUT_MS="$((MODPROBE_PATIENT_RM_TIMEOUT_SECONDS * 1000))" + MODPROBE_RM_PATIENT_TIMEOUT_ARGS="-t $MODPROBE_PATIENT_RM_TIMEOUT_MS" + fi + MODPROBE_REMOVE_PATIENT="modprobe -p $MODPROBE_RM_PATIENT_TIMEOUT_ARGS" +fi +export MODPROBE_REMOVE_PATIENT + +# checks the refcount and returns 0 if we can safely remove the module. rmmod +# does this check for us, but we can use this to also iterate checking for this +# refcount before we even try to remove the module. This is useful when using +# debug test modules which take a while to quiesce. +_patient_rmmod_check_refcnt() +{ + local module=$1 + local refcnt=0 + + if [[ -f "/sys/module/$module/refcnt" ]]; then + refcnt=$(cat "/sys/module/$module/refcnt" 2>/dev/null) + if [[ $? -ne 0 || $refcnt -eq 0 ]]; then + return 0 + fi + return 1 + fi + return 0 +} + +# Tries to wait patiently to remove a module by ensuring first +# the refcnt is 0 and then trying to remove the module over and over +# again within the time allowed. The timeout is configurable per test, just set +# MODPROBE_PATIENT_RM_TIMEOUT_SECONDS prior to including this file. +# If you want this to try forever just set MODPROBE_PATIENT_RM_TIMEOUT_SECONDS +# to the special value of "forever". This applies to both cases where kmod +# supports the patient module remover (modrobe -p) and where it does not. +# +# If your version of kmod supports modprobe -p, we instead use that +# instead. Otherwise we have to implement a patient module remover +# ourselves. +_patient_rmmod() +{ + local module=$1 + local max_tries_max=$MODPROBE_PATIENT_RM_TIMEOUT_SECONDS + local max_tries=0 + local mod_ret=0 + local refcnt_is_zero=0 + + if [[ -n $MODPROBE_REMOVE_PATIENT ]]; then + $MODPROBE_REMOVE_PATIENT "$module" + mod_ret=$? + if [[ $mod_ret -ne 0 ]]; then + echo "kmod patient module removal for $module timed out waiting for refcnt to become 0 using timeout of $max_tries_max returned $mod_ret" + fi + return $mod_ret + fi + + max_tries=$max_tries_max + + # We must use a string check as otherwise if max_tries is set to + # "forever" and we don't use a string check we can end up skipping + # entering this loop. + while [[ "$max_tries" != "0" ]]; do + if _patient_rmmod_check_refcnt "$module"; then + refcnt_is_zero=1 + break + fi + sleep 1 + if [[ "$max_tries" == "forever" ]]; then + continue + fi + ((max_tries--)) done - return 1 + + if [[ $refcnt_is_zero -ne 1 ]]; then + echo "custom patient module removal for $module timed out waiting for refcnt to become 0 using timeout of $max_tries_max" + return 1 + fi + + # If we ran out of time but our refcnt check confirms we had + # a refcnt of 0, just try to remove the module once. + if [[ "$max_tries" == "0" ]]; then + modprobe -r "$module" + return $? + fi + + # If we have extra time left. Use the time left to now try to + # persistently remove the module. We do this because although through + # the above we found refcnt to be 0, removal can still fail since + # userspace can always race to bump the refcnt. An example is any + # blkdev_open() calls against a block device. These issues have been + # tracked and documented in the following bug reports, which justifies + # our need to do this in userspace: + # https://bugzilla.kernel.org/show_bug.cgi?id=212337 + # https://bugzilla.kernel.org/show_bug.cgi?id=214015 + while [[ $max_tries != 0 ]]; do + if [[ -d /sys/module/$module ]]; then + modprobe -r "$module" 2> /dev/null + mod_ret=$? + if [[ $mod_ret == 0 ]]; then + break; + fi + sleep 1 + if [[ "$max_tries" == "forever" ]]; then + continue + fi + ((max_tries--)) + else + break + fi + done + + if [[ $mod_ret -ne 0 ]]; then + echo "custom patient module removal for $module timed out trying to remove $module using timeout of $max_tries_max last try returned $mod_ret" + fi + + return $mod_ret } diff --git a/common/scsi_debug b/common/scsi_debug index b48cdc9e2b4d..9987286086bf 100644 --- a/common/scsi_debug +++ b/common/scsi_debug @@ -4,14 +4,15 @@ # # scsi_debug helper functions. +. common/rc + _have_scsi_debug() { _have_modules scsi_debug } _init_scsi_debug() { - if ! modprobe -r scsi_debug || ! modprobe scsi_debug "$@"; then - return 1 - fi + _patient_rmmod scsi_debug || return 1 + modprobe scsi_debug "$@" || return 1 udevadm settle @@ -51,5 +52,5 @@ _exit_scsi_debug() { unset SCSI_DEBUG_TARGETS unset SCSI_DEBUG_DEVICES udevadm settle - modprobe -r scsi_debug + _patient_rmmod scsi_debug } diff --git a/tests/nvme/rc b/tests/nvme/rc index 3c38408a0bfe..c124401228d7 100644 --- a/tests/nvme/rc +++ b/tests/nvme/rc @@ -125,11 +125,13 @@ _cleanup_nvmet() { shopt -u nullglob trap SIGINT - modprobe -r nvme-"${nvme_trtype}" 2>/dev/null - if [[ "${nvme_trtype}" != "loop" ]]; then - modprobe -r nvmet-"${nvme_trtype}" 2>/dev/null - fi - modprobe -r nvmet 2>/dev/null + if [[ "${nvme_trtype}" == "loop" ]]; then + _patient_rmmod nvme_"${nvme_trtype}" + else + _patient_rmmod nvme-"${nvme_trtype}" + _patient_rmmod nvmet-"${nvme_trtype}" + fi + _patient_rmmod nvmet 2>/dev/null if [[ "${nvme_trtype}" == "rdma" ]]; then stop_soft_rdma fi diff --git a/tests/nvmeof-mp/rc b/tests/nvmeof-mp/rc index dcb2e3c71093..54d8c95d05c4 100755 --- a/tests/nvmeof-mp/rc +++ b/tests/nvmeof-mp/rc @@ -164,12 +164,13 @@ start_nvme_client() { } stop_nvme_client() { - unload_module nvme-rdma || return $? - unload_module nvme-fabrics || return $? + _patient_rmmod nvme-rdma || return 1 + _patient_rmmod nvme-fabrics || return 1 # Ignore nvme and nvme-core unload errors - this test may be run on a # system equipped with one or more NVMe SSDs. - unload_module nvme >&/dev/null - unload_module nvme-core >&/dev/null + export MODPROBE_PATIENT_RM_TIMEOUT_SECONDS=5 + _patient_rmmod nvme + _patient_rmmod nvme-core return 0 } @@ -269,9 +270,9 @@ stop_nvme_target() { rmdir "$d" done ) - unload_module nvmet_rdma && - unload_module nvmet && - _exit_null_blk + _patient_rmmod nvmet_rdma || return 1 + _patient_rmmod nvmet || return 1 + _exit_null_blk } start_target() { diff --git a/tests/srp/rc b/tests/srp/rc index d44082af39de..fd5dd099106a 100755 --- a/tests/srp/rc +++ b/tests/srp/rc @@ -323,19 +323,10 @@ start_srp_ini() { # Unload the SRP initiator driver. stop_srp_ini() { - local i - log_out - for ((i=40;i>=0;i--)); do - remove_mpath_devs || return $? - unload_module ib_srp >/dev/null 2>&1 && break - sleep 1 - done - if [ -e /sys/module/ib_srp ]; then - echo "Error: unloading kernel module ib_srp failed" - return 1 - fi - unload_module scsi_transport_srp || return $? + remove_mpath_devs || return $? + _patient_rmmod ib_srp || return 1 + _patient_rmmod scsi_transport_srp || return $? } # Associate the LIO device with name $1/$2 with file $3 and SCSI serial $4. @@ -494,7 +485,7 @@ start_lio_srpt() { if modinfo ib_srpt | grep -q '^parm:[[:blank:]]*rdma_cm_port:'; then opts+=("rdma_cm_port=${srp_rdma_cm_port}") fi - unload_module ib_srpt + _patient_rmmod ib_srpt modprobe ib_srpt "${opts[@]}" || return $? i=0 for r in "${vdev_path[@]}"; do @@ -556,7 +547,7 @@ stop_lio_srpt() { target_core_file target_core_stgt target_core_user \ target_core_mod do - unload_module $m 10 || return $? + _patient_rmmod $m || return $? done }