Message ID | 20230504131414.3826283-6-pvorel@suse.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | NFS: test on btrfs, ext4, xfs filesystems | expand |
Hi! > nfs_get_remote_path() > @@ -210,6 +213,7 @@ nfs_cleanup() > grep -q "$local_dir" /proc/mounts && umount $local_dir > n=$(( n + 1 )) > done > + sleep 2 > > n=0 > for i in $VERSION; do > @@ -219,12 +223,15 @@ nfs_cleanup() > if tst_net_use_netns; then > if test -d $remote_dir; then > exportfs -u *:$remote_dir > + sleep 1 > rm -rf $remote_dir > fi > else > tst_rhost_run -c "test -d $remote_dir && exportfs -u *:$remote_dir" > + sleep 1 > tst_rhost_run -c "test -d $remote_dir && rm -rf $remote_dir" > fi > + > n=$(( n + 1 )) > done Generally I'm not happy about the sleeps in the code, the main problem is that the test still may fail in a case that something was slower than usuall and we decided to proceed after we slept for a pre-defined interval. Ideally these should be replaced with a active waiting, i.e. loop that checks some condition 10 times in a second or so. Hoewever I'm okay with getting this in so that we manage to have these tests in before the release and fixing it later on.
> Hi! > > nfs_get_remote_path() > > @@ -210,6 +213,7 @@ nfs_cleanup() > > grep -q "$local_dir" /proc/mounts && umount $local_dir > > n=$(( n + 1 )) > > done > > + sleep 2 > > n=0 > > for i in $VERSION; do > > @@ -219,12 +223,15 @@ nfs_cleanup() > > if tst_net_use_netns; then > > if test -d $remote_dir; then > > exportfs -u *:$remote_dir > > + sleep 1 > > rm -rf $remote_dir > > fi > > else > > tst_rhost_run -c "test -d $remote_dir && exportfs -u *:$remote_dir" > > + sleep 1 > > tst_rhost_run -c "test -d $remote_dir && rm -rf $remote_dir" > > fi > > + > > n=$(( n + 1 )) > > done > Generally I'm not happy about the sleeps in the code, the main problem > is that the test still may fail in a case that something was slower than > usuall and we decided to proceed after we slept for a pre-defined > interval. Ideally these should be replaced with a active waiting, i.e. > loop that checks some condition 10 times in a second or so. Hoewever I'm > okay with getting this in so that we manage to have these tests in > before the release and fixing it later on. I wonder myself what is wrong and if there is a problem with my code (likely) or with nfsd or with loop device (Christoph Hellwig has been changing drivers/block/loop.c quite a lot). Because if first umount is too early, it does not recover (i.e. tst_umount contain 50 times trying to umount with 100ms, and that does not help). I also wonder how should I detect it's ready to be umounted. I'll have look if losetup "fuser -vm /dev/loop0" would help, but instead of depending on the device, we'd probably want to look into /proc/mounts, right? Tomorrow I'll test this: check_umount() { local i local dir="$1" local type="${2:-lhost}" local cmd="grep -q $dir /proc/mounts" for i in $(seq 50); do if [ "$type" = "lhost" ]; then $cmd || return else tst_rhost_run -c "$cmd" || return fi tst_sleep 100ms tst_res TWARN "failed to umount '$dir'" done } for i in $VERSION; do local_dir="$(get_local_dir $i $n)" grep -q "$local_dir" /proc/mounts && umount $local_dir # instead of 'sleep 2' below check here: check_umount "$local_dir" n=$(( n + 1 )) done n=0 for i in $VERSION; do type=$(get_socket_type $n) remote_dir="$(get_remote_dir $i $type)" if tst_net_use_netns; then if test -d $remote_dir; then exportfs -u *:$remote_dir check_umount "$remote_dir" # instead of sleep 1 rm -rf $remote_dir fi else tst_rhost_run -c "test -d $remote_dir && exportfs -u *:$remote_dir" check_umount "$remote_dir" rhost # instead of sleep 1 tst_rhost_run -c "test -d $remote_dir && rm -rf $remote_dir" fi n=$(( n + 1 )) done [1] https://gitlab.com/psmisc/psmisc/-/blob/master/src/fuser.c
Hi, ... > check_umount() > { > local i > local dir="$1" > local type="${2:-lhost}" > local cmd="grep -q $dir /proc/mounts" > for i in $(seq 50); do > if [ "$type" = "lhost" ]; then > $cmd || return > else > tst_rhost_run -c "$cmd" || return > fi > tst_sleep 100ms > tst_res TWARN "failed to umount '$dir'" > done Obviously TWARN should be here. ... Kind regards, Petr
Hi all, ... > Tomorrow I'll test this: > check_umount() > { > local i > local dir="$1" > local type="${2:-lhost}" > local cmd="grep -q $dir /proc/mounts" > for i in $(seq 50); do > if [ "$type" = "lhost" ]; then > $cmd || return > else > tst_rhost_run -c "$cmd" || return > fi > tst_sleep 100ms > tst_res TWARN "failed to umount '$dir'" > done > } > for i in $VERSION; do > local_dir="$(get_local_dir $i $n)" > grep -q "$local_dir" /proc/mounts && umount $local_dir > # instead of 'sleep 2' below check here: > check_umount "$local_dir" > n=$(( n + 1 )) > done > n=0 > for i in $VERSION; do > type=$(get_socket_type $n) > remote_dir="$(get_remote_dir $i $type)" > if tst_net_use_netns; then > if test -d $remote_dir; then > exportfs -u *:$remote_dir > check_umount "$remote_dir" # instead of sleep 1 > rm -rf $remote_dir > fi > else > tst_rhost_run -c "test -d $remote_dir && exportfs -u *:$remote_dir" > check_umount "$remote_dir" rhost # instead of sleep 1 > tst_rhost_run -c "test -d $remote_dir && rm -rf $remote_dir" > fi > n=$(( n + 1 )) > done I tried various code (not only the one - wrong - above), but nothing works. With fuser -vm, I wonder if there is some bug in older nfs versions: USER PID ACCESS COMMAND /tmp/LTP_nfs08.fncaxJum2W/mntpoint/4/tcp: root kernel mount /tmp/LTP_nfs08.fncaxJum2W/mntpoint root kernel knfsd /tmp/LTP_nfs08.fncaxJum2W/mntpoint/4 Looking at c6c7f2a84da4 ("nfsd: Ensure knfsd shuts down when the "nfsd" pseudofs is unmounted"), maybe there are other bugs, which causes files are used. I also tried to use /proc/fs/nfsd/unlock_filesystem and /proc/fs/nfsd/write_unlock_ip which made things better a bit (6.2.12), but still it's not 100% solution. Also, "Release all locks on a local file system" (unlock_filesystem) and Release all locks used by a client (write_unlock_ip) are unlikely what I'm searching. umount $local_dir echo "$local_dir" > /proc/fs/nfsd/unlock_filesystem # $(tst_ipaddr rhost) => IP of the server echo "$(tst_ipaddr rhost)" > /proc/fs/nfsd/write_unlock_ip ... exportfs -u *:$remote_dir echo "$remote_dir" > /proc/fs/nfsd/unlock_filesystem => any hint is welcome. Kind regards, Petr
Hi! Let's get this merged with the sleeps for now and lets try to fix it properly after the release. You can push the patchset with my Ack.
Hi Cyril, > Hi! > Let's get this merged with the sleeps for now and lets try to fix it > properly after the release. You can push the patchset with my Ack. Patchset merged, I guess we can freeze LTP before the release. Thanks! Kind regards, Petr
diff --git a/testcases/network/nfs/nfs_stress/nfs_lib.sh b/testcases/network/nfs/nfs_stress/nfs_lib.sh index 042fea5e4..abf7ba5a2 100644 --- a/testcases/network/nfs/nfs_stress/nfs_lib.sh +++ b/testcases/network/nfs/nfs_stress/nfs_lib.sh @@ -28,7 +28,10 @@ NFS_PARSE_ARGS_CALLER="$TST_PARSE_ARGS" TST_OPTS="v:t:$TST_OPTS" TST_PARSE_ARGS=nfs_parse_args TST_USAGE=nfs_usage -TST_NEEDS_TMPDIR=1 +TST_ALL_FILESYSTEMS=1 +TST_SKIP_FILESYSTEMS="exfat,ext2,ext3,fuse,ntfs,vfat,tmpfs" +TST_MOUNT_DEVICE=1 +TST_FORMAT_DEVICE=1 TST_NEEDS_ROOT=1 TST_NEEDS_CMDS="$TST_NEEDS_CMDS mount exportfs mount.nfs" TST_SETUP="${TST_SETUP:-nfs_setup}" @@ -68,7 +71,7 @@ get_remote_dir() local v="$1" local n="$2" - echo "$TST_TMPDIR/$v/$n" + echo "$TST_MNTPOINT/$v/$n" } nfs_get_remote_path() @@ -210,6 +213,7 @@ nfs_cleanup() grep -q "$local_dir" /proc/mounts && umount $local_dir n=$(( n + 1 )) done + sleep 2 n=0 for i in $VERSION; do @@ -219,12 +223,15 @@ nfs_cleanup() if tst_net_use_netns; then if test -d $remote_dir; then exportfs -u *:$remote_dir + sleep 1 rm -rf $remote_dir fi else tst_rhost_run -c "test -d $remote_dir && exportfs -u *:$remote_dir" + sleep 1 tst_rhost_run -c "test -d $remote_dir && rm -rf $remote_dir" fi + n=$(( n + 1 )) done }
because there are some problems with at least vfat, exfat and tmpfs (on nfs-utils < 2) instead of testing on all available filesystems, test just on modern Linux filesystems btrfs, ext4, xfs. Use variables: * TST_ALL_FILESYSTEMS=1 to run on all filesystems * TST_FORMAT_DEVICE=1 to get loop device formatted * TST_MOUNT_DEVICE=1 to get it mounted Filesystems (tested the usual LTP way on loop device) are used for server side (exportfs), client side (NFS mount) is kept outside of it. For some reason umounting needs some time before NFS server stops using underlying loop device. Also exportfs needs time before files can be removed. Otherwise second umounting fails: nfs07 4 TINFO: Cleaning up testcase umount: /var/tmp/LTP_nfs07.FNZ7yCbqZe/mntpoint: target is busy. nfs07 4 TINFO: umount(/var/tmp/LTP_nfs07.FNZ7yCbqZe/mntpoint) failed, try 1 ... nfs07 4 TINFO: Likely gvfsd-trash is probing newly mounted fs, kill it to speed up tests. umount: /var/tmp/LTP_nfs07.FNZ7yCbqZe/mntpoint: target is busy. Solved with adding sleep for 2 sec after first umount and sleeps for 1 sec after exportfs. Second umount of the loop device in tst_test.sh works without any extra sleep. Signed-off-by: Petr Vorel <pvorel@suse.cz> --- testcases/network/nfs/nfs_stress/nfs_lib.sh | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)