Message ID | 20250119151214.23562-1-James.Bottomley@HansenPartnership.com (mailing list archive) |
---|---|
Headers | show |
Series | convert efivarfs to manage object data correctly | expand |
On Sun, 19 Jan 2025 at 16:12, James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > > I've added fsdevel because I'm hopping some kind vfs person will check > the shift from efivarfs managing its own data to its data being > managed as part of the vfs object lifetimes. The following paragraph > should describe all you need to know about the unusual features of the > filesystem. > > efivarfs is a filesystem projecting the current state of the UEFI > variable store and allowing updates via write. Because EFI variables > contain both contents and a set of attributes, which can't be mapped > to filesystem data, the u32 attribute is prepended to the output of > the file and, since UEFI variables can't be empty, this makes every > file at least 5 characters long. EFI variables can be removed either > by doing an unlink (easy) or by doing a conventional write update that > reduces the content to zero size, which means any write update can > potentially remove the file. > > Currently efivarfs has two bugs: it leaks memory and if a create is > attempted that results in an error in the write, it creates a zero > length file remnant that doesn't represent an EFI variable (i.e. the > state reflection of the EFI variable store goes out of sync). > > The code uses inode->i_private to point to additionaly allocated > information but tries to maintain a global list of the shadowed > varibles for internal tracking. Forgetting to kfree() entries in this > list when they are deleted is the source of the memory leak. > > I've tried to make the patches as easily reviewable by non-EFI people > as possible, so some possible cleanups (like consolidating or removing > the efi lock handling and possibly removing the additional entry > allocation entirely in favour of simply converting the dentry name to > the variable name and guid) are left for later. > > The first patch removes some unused fields in the entry; patches 2-3 > eliminate the list search for duplication (some EFI variable stores > have buggy iterators) and replaces it with a dcache lookup. Patch 4 > move responsibility for freeing the entry data to > inode_alloc/inode_free which both fixes the memory leak and also means > we no longer need to iterate over the variable list and free its > entries in kill_sb. Since the variable list is now unused, patch 5 > removes it and its helper functions. > > Patch 6 fixes the second bug by introducing a file_operations->release > method that checks to see if the inode size is zero when the file is > closed and removes it if it is. Since all files must be at least 5 in > length we use a zero i_size as an indicator that either the variable > was removed on write or that it wasn't correctly created in the first > place. > > Patch 7 fixes the old self tests which check for zero length files > on incorrect variable creation (these are now removed). > > Patch 8 adds a new set of self tests for multi threaded variable > updates checking for the new behaviour. > > v2: folded in feedback from Al Viro: check errors on lookup and delete > zero length file on last close > > v3: convert to alloc/free instead of evict and use a boolean in > efivar_entry under the inode lock to indicate removal and add > additional selftests > Thanks James. I've queued up this version now, so we'll get some coverage from the robots. I'll redo my own testing tomorrow, but I'll omit these changes from my initial PR to Linus. If we're confident that things are sound, I'll send another PR during the second half of the merge window.
On Sun, 19 Jan 2025 at 17:59, Ard Biesheuvel <ardb@kernel.org> wrote: > > On Sun, 19 Jan 2025 at 16:12, James Bottomley > <James.Bottomley@hansenpartnership.com> wrote: > > ... > > Thanks James. I've queued up this version now, so we'll get some > coverage from the robots. I'll redo my own testing tomorrow, but I'll > omit these changes from my initial PR to Linus. If we're confident > that things are sound, I'll send another PR during the second half of > the merge window. I'm hitting the failure cases below. The first one appears to hit the same 'Operation not permitted' condition on the write, the error message is just hidden by the /dev/null redirect. I'm running the make command from a root shell. Using printf from the command line works happily so I suspect there is some issue with the concurrency and the subshells? # -------------------- # running test_multiple_zero_size # -------------------- # [FAIL] # -------------------- # running test_multiple_create # -------------------- # ./efivarfs.sh: line 294: /sys/firmware/efi/efivars/test_multiple-210be57c-9849-4fc7-a635-e6382d1aec27: Operation not permitted # [FAIL] # -------------------- # running test_multiple_delete_on_write # -------------------- # [PASS] not ok 1 selftests: efivarfs: efivarfs.sh # exit=1
On Mon, 2025-01-20 at 17:31 +0100, Ard Biesheuvel wrote: > On Sun, 19 Jan 2025 at 17:59, Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Sun, 19 Jan 2025 at 16:12, James Bottomley > > <James.Bottomley@hansenpartnership.com> wrote: > > > > ... > > > > Thanks James. I've queued up this version now, so we'll get some > > coverage from the robots. I'll redo my own testing tomorrow, but > > I'll > > omit these changes from my initial PR to Linus. If we're confident > > that things are sound, I'll send another PR during the second half > > of > > the merge window. > > I'm hitting the failure cases below. The first one appears to hit the > same 'Operation not permitted' condition on the write, the error > message is just hidden by the /dev/null redirect. > > I'm running the make command from a root shell. Using printf from the > command line works happily so I suspect there is some issue with the > concurrency and the subshells? It could be that the file isn't opened until the subshell is spawned. I can probably use the pipe to wait for the subshell to start; I'll try to code that up. Regards, James
On Mon, 2025-01-20 at 13:57 -0500, James Bottomley wrote: > On Mon, 2025-01-20 at 17:31 +0100, Ard Biesheuvel wrote: > > On Sun, 19 Jan 2025 at 17:59, Ard Biesheuvel <ardb@kernel.org> > > wrote: > > > > > > On Sun, 19 Jan 2025 at 16:12, James Bottomley > > > <James.Bottomley@hansenpartnership.com> wrote: > > > > > > ... > > > > > > Thanks James. I've queued up this version now, so we'll get some > > > coverage from the robots. I'll redo my own testing tomorrow, but > > > I'll omit these changes from my initial PR to Linus. If we're > > > confident that things are sound, I'll send another PR during the > > > second half of the merge window. > > > > I'm hitting the failure cases below. The first one appears to hit > > the same 'Operation not permitted' condition on the write, the > > error message is just hidden by the /dev/null redirect. > > > > I'm running the make command from a root shell. Using printf from > > the command line works happily so I suspect there is some issue > > with the concurrency and the subshells? > > It could be that the file isn't opened until the subshell is spawned. > I can probably use the pipe to wait for the subshell to start; I'll > try to code that up. OK, this is what I came up with. It works for me (but then so did the other script ... I think my testing VM is probably a little slow). Regards, James --->8>8>8<8<8<8--- From: James Bottomley <James.Bottomley@HansenPartnership.com> Subject: [PATCH] selftests/efivarfs: add concurrent update tests The delete on last close functionality can now only be tested properly by using multiple threads to hold open the variable files and testing what happens as they complete. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> --- diff --git a/tools/testing/selftests/efivarfs/efivarfs.sh b/tools/testing/selftests/efivarfs/efivarfs.sh index 4a84a810dc2c..57e2e151b8d9 100755 --- a/tools/testing/selftests/efivarfs/efivarfs.sh +++ b/tools/testing/selftests/efivarfs/efivarfs.sh @@ -227,6 +227,136 @@ test_no_set_size() exit $ret } +setup_test_multiple() +{ + ## + # we're going to do multi-threaded tests, so create a set of + # pipes for synchronization. We use pipes 1..3 to start the + # stalled shell job and pipes 4..6 as indicators that the job + # has started. If you need more than 3 jobs the two +3's below + # need increasing + ## + + declare -ag p + + # empty is because arrays number from 0 but jobs number from 1 + p[0]="" + + for f in 1 2 3 4 5 6; do + p[$f]=/tmp/efivarfs_pipe${f} + mknod ${p[$f]} p + done + + declare -g var=$efivarfs_mount/test_multiple-$test_guid + + cleanup() { + for f in ${p[@]}; do + rm -f ${f} + done + if [ -e $var ]; then + file_cleanup $var + fi + } + trap cleanup exit + + waitstart() { + cat ${p[$[$1+3]]} > /dev/null + } + + waitpipe() { + echo 1 > ${p[$[$1+3]]} + cat ${p[$1]} > /dev/null + } + + endjob() { + echo 1 > ${p[$1]} + wait -n %$1 + } +} + +test_multiple_zero_size() +{ + ## + # check for remove on last close, set up three threads all + # holding the variable (one write and two reads) and then + # close them sequentially (waiting for completion) and check + # the state of the variable + ## + + { waitpipe 1; echo 1; } > $var 2> /dev/null & + waitstart 1 + # zero length file should exist + [ -e $var ] || exit 1 + # second and third delayed close + { waitpipe 2; } < $var & + waitstart 2 + { waitpipe 3; } < $var & + waitstart 3 + # close first fd + endjob 1 + # var should only be deleted on last close + [ -e $var ] || exit 1 + # close second fd + endjob 2 + [ -e $var ] || exit 1 + # file should go on last close + endjob 3 + [ ! -e $var ] || exit 1 +} + +test_multiple_create() +{ + ## + # set multiple threads to access the variable but delay + # the final write to check the close of 2 and 3. The + # final write should succeed in creating the variable + ## + { waitpipe 1; printf '\x07\x00\x00\x00\x54'; } > $var & + waitstart 1 + [ -e $var -a ! -s $var ] || exit 1 + { waitpipe 2; } < $var & + waitstart 2 + { waitpipe 3; } < $var & + waitstart 3 + # close second and third fds + endjob 2 + # var should only be created (have size) on last close + [ -e $var -a ! -s $var ] || exit 1 + endjob 3 + [ -e $var -a ! -s $var ] || exit 1 + # close first fd + endjob 1 + # variable should still exist + [ -s $var ] || exit 1 + file_cleanup $var +} + +test_multiple_delete_on_write() { + ## + # delete the variable on final write; seqencing similar + # to test_multiple_create() + ## + printf '\x07\x00\x00\x00\x54' > $var + chattr -i $var + { waitpipe 1; printf '\x07\x00\x00\x00'; } > $var & + waitstart 1 + [ -e $var -a -s $var ] || exit 1 + { waitpipe 2; } < $var & + waitstart 2 + { waitpipe 3; } < $var & + waitstart 3 + # close first fd; write should set variable size to zero + endjob 1 + # var should only be deleted on last close + [ -e $var -a ! -s $var ] || exit 1 + endjob 2 + [ -e $var ] || exit 1 + # close last fd + endjob 3 + # variable should now be removed + [ ! -e $var ] || exit 1 +} + check_prereqs rc=0 @@ -240,5 +370,9 @@ run_test test_open_unlink run_test test_valid_filenames run_test test_invalid_filenames run_test test_no_set_size +setup_test_multiple +run_test test_multiple_zero_size +run_test test_multiple_create +run_test test_multiple_delete_on_write exit $rc
On Tue, 21 Jan 2025 at 14:53, James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > > On Mon, 2025-01-20 at 13:57 -0500, James Bottomley wrote: > > On Mon, 2025-01-20 at 17:31 +0100, Ard Biesheuvel wrote: > > > On Sun, 19 Jan 2025 at 17:59, Ard Biesheuvel <ardb@kernel.org> > > > wrote: > > > > > > > > On Sun, 19 Jan 2025 at 16:12, James Bottomley > > > > <James.Bottomley@hansenpartnership.com> wrote: > > > > > > > > ... > > > > > > > > Thanks James. I've queued up this version now, so we'll get some > > > > coverage from the robots. I'll redo my own testing tomorrow, but > > > > I'll omit these changes from my initial PR to Linus. If we're > > > > confident that things are sound, I'll send another PR during the > > > > second half of the merge window. > > > > > > I'm hitting the failure cases below. The first one appears to hit > > > the same 'Operation not permitted' condition on the write, the > > > error message is just hidden by the /dev/null redirect. > > > > > > I'm running the make command from a root shell. Using printf from > > > the command line works happily so I suspect there is some issue > > > with the concurrency and the subshells? > > > > It could be that the file isn't opened until the subshell is spawned. > > I can probably use the pipe to wait for the subshell to start; I'll > > try to code that up. > > OK, this is what I came up with. It works for me (but then so did the > other script ... I think my testing VM is probably a little slow). > Thanks - that works on my end too. I am testing this on a bare metal arm64 machine, but an ancient and very slow one. So it might be the other way around. I am going to grab this version and queue it up. Thanks.