Message ID | 20210421092317.68716-1-amir73il@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Test overlayfs readdir cache | expand |
On Wed, Apr 21, 2021 at 12:23 PM Amir Goldstein <amir73il@gmail.com> wrote: > > Eryu, > > This extends the generic t_dir_offset2 test to verify > some more test cases and adds a new generic test which > passes on overlayfs (and other fs) on upstream kernel. > > The overlayfs specific test fails on upstream kernel > and the fix commit is currently in linux-next. > As usual, you may want to wait with merging until the fix > commit hits upstream. > > Miklos, > > I had noticed in the test full logs that readdir of > a merged dir behaves strangely - when seeking backwards > to offsets > 0, readdir returns unlinked entries in results. > The test does not fail on that behavior because the test > only asserts that this is not allowed after seek to offset 0. > > Knowing the implementation of overlayfs readdir cache this is > not surprising to me, but I wonder if this behavior is POSIX > compliant, and if not, whether we should document it and/or > add a failing test for it. > I think the outcome could be worse. If a copied up entry is unlinked after populating the dir cache but before ovl_cache_update_ino() then ovl_cache_update_ino() and subsequently the getdents call will fail with ENOENT. This test is not smart enough to cover this case (if it really exists). I think we need to relax the case of negative lookup result in ovl_cache_update_ino() and just set p->whiteout without and continue with no error. This can solve several test cases. In principle, we can "semi-reset" the merge dir cache if the dir was modified after every seek, not just seek to 0. By "semi-reset" I mean use the list, but force ovl_cache_update_ino() to all upper entries, similar to ovl_dir_read_impure(). OR.. we can just do that unconditionally in ovl_iterate(). The ovl dentry cache for the children will be populated after the first ovl_iterate() anyway, so maybe the penalty is not so bad? Thanks, Amir.
On Thu, Apr 22, 2021 at 8:18 AM Amir Goldstein <amir73il@gmail.com> wrote: > > On Wed, Apr 21, 2021 at 12:23 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > Eryu, > > > > This extends the generic t_dir_offset2 test to verify > > some more test cases and adds a new generic test which > > passes on overlayfs (and other fs) on upstream kernel. > > > > The overlayfs specific test fails on upstream kernel > > and the fix commit is currently in linux-next. > > As usual, you may want to wait with merging until the fix > > commit hits upstream. > > > > Miklos, > > > > I had noticed in the test full logs that readdir of > > a merged dir behaves strangely - when seeking backwards > > to offsets > 0, readdir returns unlinked entries in results. > > The test does not fail on that behavior because the test > > only asserts that this is not allowed after seek to offset 0. > > > > Knowing the implementation of overlayfs readdir cache this is > > not surprising to me, but I wonder if this behavior is POSIX > > compliant, and if not, whether we should document it and/or > > add a failing test for it. > > > > I think the outcome could be worse. > If a copied up entry is unlinked after populating the dir cache > but before ovl_cache_update_ino() then ovl_cache_update_ino() > and subsequently the getdents call will fail with ENOENT. > > This test is not smart enough to cover this case (if it really exists). > I think we need to relax the case of negative lookup result in > ovl_cache_update_ino() and just set p->whiteout without and > continue with no error. > > This can solve several test cases. > In principle, we can "semi-reset" the merge dir cache if the dir was > modified after every seek, not just seek to 0. > By "semi-reset" I mean use the list, but force ovl_cache_update_ino() > to all upper entries, similar to ovl_dir_read_impure(). > > OR.. we can just do that unconditionally in ovl_iterate(). > The ovl dentry cache for the children will be populated after the first > ovl_iterate() anyway, so maybe the penalty is not so bad? POSIX does allow stale readdir data (not just in case of non-zero seek): "If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified." If you think about the way readdir(3) is implemented by the libc, this is inevitable. Returning ENOENT from readdir(3) is obviously a bug. The merge case being not super high performance is perfectly okay. The only thing I've worried about in that case is unbound memory use, but apparently that hasn't been an issue in practice. Thanks, Miklos
On Thu, Apr 22, 2021 at 10:53 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Thu, Apr 22, 2021 at 8:18 AM Amir Goldstein <amir73il@gmail.com> wrote: > > > > On Wed, Apr 21, 2021 at 12:23 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > Eryu, > > > > > > This extends the generic t_dir_offset2 test to verify > > > some more test cases and adds a new generic test which > > > passes on overlayfs (and other fs) on upstream kernel. > > > > > > The overlayfs specific test fails on upstream kernel > > > and the fix commit is currently in linux-next. > > > As usual, you may want to wait with merging until the fix > > > commit hits upstream. > > > > > > Miklos, > > > > > > I had noticed in the test full logs that readdir of > > > a merged dir behaves strangely - when seeking backwards > > > to offsets > 0, readdir returns unlinked entries in results. > > > The test does not fail on that behavior because the test > > > only asserts that this is not allowed after seek to offset 0. > > > > > > Knowing the implementation of overlayfs readdir cache this is > > > not surprising to me, but I wonder if this behavior is POSIX > > > compliant, and if not, whether we should document it and/or > > > add a failing test for it. > > > > > > > I think the outcome could be worse. > > If a copied up entry is unlinked after populating the dir cache > > but before ovl_cache_update_ino() then ovl_cache_update_ino() > > and subsequently the getdents call will fail with ENOENT. > > > > This test is not smart enough to cover this case (if it really exists). > > I think we need to relax the case of negative lookup result in > > ovl_cache_update_ino() and just set p->whiteout without and > > continue with no error. > > > > This can solve several test cases. > > In principle, we can "semi-reset" the merge dir cache if the dir was > > modified after every seek, not just seek to 0. > > By "semi-reset" I mean use the list, but force ovl_cache_update_ino() > > to all upper entries, similar to ovl_dir_read_impure(). > > > > OR.. we can just do that unconditionally in ovl_iterate(). > > The ovl dentry cache for the children will be populated after the first > > ovl_iterate() anyway, so maybe the penalty is not so bad? > > POSIX does allow stale readdir data (not just in case of non-zero seek): > > "If a file is removed from or added to the directory after the most > recent call to opendir() or rewinddir(), whether a subsequent call to > readdir() returns an entry for that file is unspecified." > > If you think about the way readdir(3) is implemented by the libc, this > is inevitable. I see. In that case, I would defer merging this test as it assumes too much about readdir behavior (even though applications may expect this behavior). > > Returning ENOENT from readdir(3) is obviously a bug. > > The merge case being not super high performance is perfectly okay. > The only thing I've worried about in that case is unbound memory use, > but apparently that hasn't been an issue in practice. > Okay, so I will try to reproduce the ENOENT and fix it. In any case, even if the bug exists it's not urgent. Thanks, Amir.
On Thu, Apr 22, 2021 at 10:47 AM Amir Goldstein <amir73il@gmail.com> wrote: > > On Thu, Apr 22, 2021 at 10:53 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > On Thu, Apr 22, 2021 at 8:18 AM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > On Wed, Apr 21, 2021 at 12:23 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > > > Eryu, > > > > > > > > This extends the generic t_dir_offset2 test to verify > > > > some more test cases and adds a new generic test which > > > > passes on overlayfs (and other fs) on upstream kernel. > > > > > > > > The overlayfs specific test fails on upstream kernel > > > > and the fix commit is currently in linux-next. > > > > As usual, you may want to wait with merging until the fix > > > > commit hits upstream. > > > > > > > > Miklos, > > > > > > > > I had noticed in the test full logs that readdir of > > > > a merged dir behaves strangely - when seeking backwards > > > > to offsets > 0, readdir returns unlinked entries in results. > > > > The test does not fail on that behavior because the test > > > > only asserts that this is not allowed after seek to offset 0. > > > > > > > > Knowing the implementation of overlayfs readdir cache this is > > > > not surprising to me, but I wonder if this behavior is POSIX > > > > compliant, and if not, whether we should document it and/or > > > > add a failing test for it. > > > > > > > > > > I think the outcome could be worse. > > > If a copied up entry is unlinked after populating the dir cache > > > but before ovl_cache_update_ino() then ovl_cache_update_ino() > > > and subsequently the getdents call will fail with ENOENT. > > > > > > This test is not smart enough to cover this case (if it really exists). > > > I think we need to relax the case of negative lookup result in > > > ovl_cache_update_ino() and just set p->whiteout without and > > > continue with no error. > > > > > > This can solve several test cases. > > > In principle, we can "semi-reset" the merge dir cache if the dir was > > > modified after every seek, not just seek to 0. > > > By "semi-reset" I mean use the list, but force ovl_cache_update_ino() > > > to all upper entries, similar to ovl_dir_read_impure(). > > > > > > OR.. we can just do that unconditionally in ovl_iterate(). > > > The ovl dentry cache for the children will be populated after the first > > > ovl_iterate() anyway, so maybe the penalty is not so bad? > > > > POSIX does allow stale readdir data (not just in case of non-zero seek): > > > > "If a file is removed from or added to the directory after the most > > recent call to opendir() or rewinddir(), whether a subsequent call to > > readdir() returns an entry for that file is unspecified." > > > > If you think about the way readdir(3) is implemented by the libc, this > > is inevitable. > > I see. In that case, I would defer merging this test as it assumes too much > about readdir behavior (even though applications may expect this behavior). FWIW, I started writing a readdir stress/validator similar to fsx-linux but for directories. It's unfinished and has performance problem as the directory grows due to linear searches. Putting it out in case someone wants to continue working on it, or just take some ideas. Thanks, Miklos
On Thu, Apr 22, 2021 at 10:53 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Thu, Apr 22, 2021 at 8:18 AM Amir Goldstein <amir73il@gmail.com> wrote: > > > > On Wed, Apr 21, 2021 at 12:23 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > Eryu, > > > > > > This extends the generic t_dir_offset2 test to verify > > > some more test cases and adds a new generic test which > > > passes on overlayfs (and other fs) on upstream kernel. > > > > > > The overlayfs specific test fails on upstream kernel > > > and the fix commit is currently in linux-next. > > > As usual, you may want to wait with merging until the fix > > > commit hits upstream. > > > > > > Miklos, > > > > > > I had noticed in the test full logs that readdir of > > > a merged dir behaves strangely - when seeking backwards > > > to offsets > 0, readdir returns unlinked entries in results. > > > The test does not fail on that behavior because the test > > > only asserts that this is not allowed after seek to offset 0. > > > > > > Knowing the implementation of overlayfs readdir cache this is > > > not surprising to me, but I wonder if this behavior is POSIX > > > compliant, and if not, whether we should document it and/or > > > add a failing test for it. > > > > > > > I think the outcome could be worse. > > If a copied up entry is unlinked after populating the dir cache > > but before ovl_cache_update_ino() then ovl_cache_update_ino() > > and subsequently the getdents call will fail with ENOENT. > > > > This test is not smart enough to cover this case (if it really exists). > > I think we need to relax the case of negative lookup result in > > ovl_cache_update_ino() and just set p->whiteout without and > > continue with no error. > > > > This can solve several test cases. > > In principle, we can "semi-reset" the merge dir cache if the dir was > > modified after every seek, not just seek to 0. > > By "semi-reset" I mean use the list, but force ovl_cache_update_ino() > > to all upper entries, similar to ovl_dir_read_impure(). > > > > OR.. we can just do that unconditionally in ovl_iterate(). > > The ovl dentry cache for the children will be populated after the first > > ovl_iterate() anyway, so maybe the penalty is not so bad? > > POSIX does allow stale readdir data (not just in case of non-zero seek): > > "If a file is removed from or added to the directory after the most > recent call to opendir() or rewinddir(), whether a subsequent call to > readdir() returns an entry for that file is unspecified." > > If you think about the way readdir(3) is implemented by the libc, this > is inevitable. > That makes the test I posted wrong, because it expects the dir modifications to be visible after seek to 0. The thing is, unlike readdir(3) implementation, overlayfs keeps the readdir cache on the inode, so by keeping the original fd open and opening a new fd, I could reproduce stale and missing entries for a new opendir, which is definitely a bug. Will post the updated test. Thanks, Amir.
On Thu, Apr 22, 2021 at 10:53 AM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Thu, Apr 22, 2021 at 8:18 AM Amir Goldstein <amir73il@gmail.com> wrote: > > > > On Wed, Apr 21, 2021 at 12:23 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > > > Eryu, > > > > > > This extends the generic t_dir_offset2 test to verify > > > some more test cases and adds a new generic test which > > > passes on overlayfs (and other fs) on upstream kernel. > > > > > > The overlayfs specific test fails on upstream kernel > > > and the fix commit is currently in linux-next. > > > As usual, you may want to wait with merging until the fix > > > commit hits upstream. > > > > > > Miklos, > > > > > > I had noticed in the test full logs that readdir of > > > a merged dir behaves strangely - when seeking backwards > > > to offsets > 0, readdir returns unlinked entries in results. > > > The test does not fail on that behavior because the test > > > only asserts that this is not allowed after seek to offset 0. > > > > > > Knowing the implementation of overlayfs readdir cache this is > > > not surprising to me, but I wonder if this behavior is POSIX > > > compliant, and if not, whether we should document it and/or > > > add a failing test for it. > > > > > > > I think the outcome could be worse. > > If a copied up entry is unlinked after populating the dir cache > > but before ovl_cache_update_ino() then ovl_cache_update_ino() > > and subsequently the getdents call will fail with ENOENT. > > > > This test is not smart enough to cover this case (if it really exists). > > I think we need to relax the case of negative lookup result in > > ovl_cache_update_ino() and just set p->whiteout without and > > continue with no error. > > > > This can solve several test cases. > > In principle, we can "semi-reset" the merge dir cache if the dir was > > modified after every seek, not just seek to 0. > > By "semi-reset" I mean use the list, but force ovl_cache_update_ino() > > to all upper entries, similar to ovl_dir_read_impure(). > > > > OR.. we can just do that unconditionally in ovl_iterate(). > > The ovl dentry cache for the children will be populated after the first > > ovl_iterate() anyway, so maybe the penalty is not so bad? > > POSIX does allow stale readdir data (not just in case of non-zero seek): > > "If a file is removed from or added to the directory after the most > recent call to opendir() or rewinddir(), whether a subsequent call to > readdir() returns an entry for that file is unspecified." > > If you think about the way readdir(3) is implemented by the libc, this > is inevitable. > > Returning ENOENT from readdir(3) is obviously a bug. There is no ENOENT bug. I read the code in ovl_cache_update_ino() wrong, unless lookup_one_len() returns PTR_ERR(-ENOENT) instead of a negative dentry and I never understood if this ever happens, so the most we need to do beyond the fix already in overlayfs-next in to check that -ENOENT case. Thanks, Amir.