Message ID | 20210425071445.29547-1-amir73il@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Apr 25, 2021 at 9:14 AM Amir Goldstein <amir73il@gmail.com> wrote: > > Eryu, > > This extends the generic t_dir_offset2 helper program to verify > some cases of missing/stale entries and adds a new generic test which > passes on overlayfs (and other fs) on upstream kernel. > > The overlayfs specific test fails on upstream kernel and the fix commit > is currently in linux-next. As usual, you may want to wait with merging > until the fix commit hits upstream. > > Based on feedback from Miklos, I changed the test to check for the > missing/stale entries on a new fd, while old fd is kept open, because > POSIX allows for stale/missing entries in the old fd. > > I was looking into another speculated bug in overlayfs which involves > multiple calls to getdents. Although it turned out that overlayfs does > not have the speculated bug, I left both generic and overlay test with > multiple calls to getdents in order to excersize the relevant code. > > The attached patch was used to verify that the overlayfs test excercises > the call to ovl_cache_update_ino() with stale entries. > Overlayfs populates the merge dir readdir cache with a list of files in > the first getdents call, but updates d_ino of files on the list in > subsequent getdents calls. By that time, the last entry is stale and the > following warning is printed (on linux-next with patch below applied): > [ ] overlayfs: failed to look up (m100) for ino (0) > [ ] overlayfs: failed to look up (f100) for ino (0) > > Miklos, > > Do you think it is worth the trouble to set p->is_whiteout and skip > dir_emit() in this case? and do we need to worry about lookup_one_len() > returning -ENOENT in this case? So lookup_one_len() first does a cached lookup, and if found returns that. If not then it calls the filesystem's ->lookup() callback, which in this case is ovl_lookup(). AFAICS ovl_lookup() will never return ENOENT, even if the underlying filesystem does. Which means it's not necessary to worry about that case. The other case you found it that in case of a stale direntry the i_ino update will be skipped and so it will return an inconsistent result, right? Fixing that looks worthwhile, yes. Thanks, Miklos
On Mon, Apr 26, 2021 at 1:07 PM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Sun, Apr 25, 2021 at 9:14 AM Amir Goldstein <amir73il@gmail.com> wrote: > > > > Eryu, > > > > This extends the generic t_dir_offset2 helper program to verify > > some cases of missing/stale entries and adds a new generic test which > > passes on overlayfs (and other fs) on upstream kernel. > > > > The overlayfs specific test fails on upstream kernel and the fix commit > > is currently in linux-next. As usual, you may want to wait with merging > > until the fix commit hits upstream. > > > > Based on feedback from Miklos, I changed the test to check for the > > missing/stale entries on a new fd, while old fd is kept open, because > > POSIX allows for stale/missing entries in the old fd. > > > > I was looking into another speculated bug in overlayfs which involves > > multiple calls to getdents. Although it turned out that overlayfs does > > not have the speculated bug, I left both generic and overlay test with > > multiple calls to getdents in order to excersize the relevant code. > > > > The attached patch was used to verify that the overlayfs test excercises > > the call to ovl_cache_update_ino() with stale entries. > > Overlayfs populates the merge dir readdir cache with a list of files in > > the first getdents call, but updates d_ino of files on the list in > > subsequent getdents calls. By that time, the last entry is stale and the > > following warning is printed (on linux-next with patch below applied): > > [ ] overlayfs: failed to look up (m100) for ino (0) > > [ ] overlayfs: failed to look up (f100) for ino (0) > > > > Miklos, > > > > Do you think it is worth the trouble to set p->is_whiteout and skip > > dir_emit() in this case? and do we need to worry about lookup_one_len() > > returning -ENOENT in this case? > > So lookup_one_len() first does a cached lookup, and if found returns > that. If not then it calls the filesystem's ->lookup() callback, > which in this case is ovl_lookup(). AFAICS ovl_lookup() will never > return ENOENT, even if the underlying filesystem does. > > Which means it's not necessary to worry about that case. > > The other case you found it that in case of a stale direntry the i_ino > update will be skipped and so it will return an inconsistent result, > right? Right. It returns a stale entry with the old real ino. Not sure if that is an "inconsistent" result. inconsistent w.r.t what? > Fixing that looks worthwhile, yes. > Will look into it. Thanks, Amir.
On Mon, Apr 26, 2021 at 12:15 PM Amir Goldstein <amir73il@gmail.com> wrote: > > On Mon, Apr 26, 2021 at 1:07 PM Miklos Szeredi <miklos@szeredi.hu> wrote: > > The other case you found it that in case of a stale direntry the i_ino > > update will be skipped and so it will return an inconsistent result, > > right? > > Right. It returns a stale entry with the old real ino. > Not sure if that is an "inconsistent" result. > inconsistent w.r.t what? It's inconsistent with previous (before the entry got deleted) st_ino/i_ino. This should actually be testable. But it was a long time ago that I fully understood the readdir code, so I might be missing something. Thanks, Miklos
On Mon, Apr 26, 2021 at 4:13 PM Miklos Szeredi <miklos@szeredi.hu> wrote: > > On Mon, Apr 26, 2021 at 12:15 PM Amir Goldstein <amir73il@gmail.com> wrote: > > > > On Mon, Apr 26, 2021 at 1:07 PM Miklos Szeredi <miklos@szeredi.hu> wrote: > > > > The other case you found it that in case of a stale direntry the i_ino > > > update will be skipped and so it will return an inconsistent result, > > > right? > > > > Right. It returns a stale entry with the old real ino. > > Not sure if that is an "inconsistent" result. > > inconsistent w.r.t what? > > It's inconsistent with previous (before the entry got deleted) > st_ino/i_ino. This should actually be testable. Right. it is testable: QA output created by 077 +entry m100 has inconsistent d_ino (266 != 264) +entry f100 has inconsistent d_ino (367 != 16777542) Silence is golden These prints are from the iteration on the first fd. The first fd lists the stale entry with inconsistent d_ino. The second fd does not list the stale entry (with bugfix in linux-next). Will add it to the test in V3. Attached patch fixes this problem. Thanks, Amir. From 304e1599cc112fae388d0c0b2aaabdf385032226 Mon Sep 17 00:00:00 2001 From: Amir Goldstein <amir73il@gmail.com> Date: Mon, 26 Apr 2021 18:07:09 +0300 Subject: [PATCH] ovl: skip stale entries in merge dir cache iteration On the first getdents call, ovl_iterate() populates the readdir cache with a list of entries, but for upper entries with origin lower inode, p->ino remains zero. Following getdents calls traverse the readdir cache list and call ovl_cache_update_ino() for entries with zero p->ino to lookup the entry in the overlay and return d_ino that is consistent with st_ino. If the upper file was unlinked between the first getdents call and the getdents call that lists the file entry, ovl_cache_update_ino() will not find the entry and fall back to setting d_ino to the upper real st_ino, which is inconsistent with how this object was presented to users. Instead of listing a stale entry with inconsistent d_ino, simply skip the stale entry, which is better for users. Signed-off-by: Amir Goldstein <amir73il@gmail.com> --- fs/overlayfs/readdir.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c index cc1e80257064..10b7780e4bdc 100644 --- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -481,6 +481,8 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p) } this = lookup_one_len(p->name, dir, p->len); if (IS_ERR_OR_NULL(this) || !this->d_inode) { + /* Mark a stale entry */ + p->is_whiteout = true; if (IS_ERR(this)) { err = PTR_ERR(this); this = NULL; @@ -776,6 +778,9 @@ static int ovl_iterate(struct file *file, struct dir_context *ctx) if (err) goto out; } + } + /* ovl_cache_update_ino() sets is_whiteout on stale entry */ + if (!p->is_whiteout) { if (!dir_emit(ctx, p->name, p->len, p->ino, p->type)) break; }
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c index cc1e80257064..cadcbfafa179 100644 --- a/fs/overlayfs/readdir.c +++ b/fs/overlayfs/readdir.c @@ -486,7 +486,7 @@ static int ovl_cache_update_ino(struct path *path, struct ovl_cache_entry *p) this = NULL; goto fail; } - goto out; + goto fail; } get: