Message ID | 20221117115023.1350181-2-dwysocha@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/1] fscache: Fix oops due to race with cookie_lru and use_cookie |
Dave Wysochanski <dwysocha@redhat.com> wrote:

> If a cookie expires from the LRU and the LRU_DISCARD flag is set,
> but the state machine has not run yet, it's possible another thread
> can call fscache_use_cookie and begin to use it. When the
> cookie_worker finally runs, it will see the LRU_DISCARD flag set,
> transition the cookie->state to LRU_DISCARDING, which will then
> withdraw the cookie. Once the cookie is withdrawn the object is
> removed and the below oops will occur because the object associated
> with the cookie is now NULL.
>
> Fix the oops by clearing the LRU_DISCARD bit if another thread
> uses the cookie before the cookie_worker runs.

I think this is the right approach. The state machine should just fall
through without doing anything, despite having been woken.

David
You can probably make this easier to trigger by putting a delay in the state
machine if the flag is set:

[fs/fscache/cookie.c]
+#include <linux/delay.h>
...
 static void fscache_cookie_state_machine(struct fscache_cookie *cookie)
 {
 	enum fscache_cookie_state state;
 	bool wake = false;

 	_enter("c=%x", cookie->debug_id);

 again:
+	if (test_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags))
+		msleep(100);
 again_locked:
 	state = cookie->state;
 	switch (state) {

David
Dave Wysochanski <dwysocha@redhat.com> wrote:
> + clear_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags);
Actually, can you do test_and_clear_bit() and then log a trace point, say:
fscache_see_cookie(cookie, fscache_cookie_see_lru_discard_cancel);
if the bit was set.
David
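
For illustration only, here is a userspace sketch of the test-and-clear pattern suggested above. It does not use the kernel bitops or the fscache trace infrastructure: `model_cookie`, `model_test_and_clear_bit()`, `model_use_cookie()` and the `cancel_traces` counter are all hypothetical stand-ins built on C11 atomics. The point it tries to show is that an atomic test-and-clear reports whether a discard was actually pending, so the trace only fires when something was cancelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bit number of FSCACHE_COOKIE_DO_LRU_DISCARD. */
#define MODEL_DO_LRU_DISCARD 0

/*
 * Hypothetical userspace model of a cookie: a flags word plus a counter
 * standing in for the proposed "lru_discard_cancel" trace point.
 */
struct model_cookie {
	atomic_ulong flags;
	unsigned int cancel_traces;
};

/*
 * Atomically clear bit nr and report whether it was previously set.
 * This models the contract of the kernel's test_and_clear_bit(); it is
 * not the kernel implementation.
 */
static bool model_test_and_clear_bit(unsigned int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_and(word, ~mask) & mask) != 0;
}

/*
 * Model of the use path: cancel a pending discard and count a trace
 * only when there was one to cancel.
 */
static void model_use_cookie(struct model_cookie *cookie)
{
	if (model_test_and_clear_bit(MODEL_DO_LRU_DISCARD, &cookie->flags))
		cookie->cancel_traces++;
}

/* Small walk-through: a pending discard is cancelled exactly once. */
static void model_demo(void)
{
	struct model_cookie c = { .cancel_traces = 0 };

	atomic_init(&c.flags, 1UL << MODEL_DO_LRU_DISCARD);

	model_use_cookie(&c);	/* bit was set: cleared and traced */
	assert(c.cancel_traces == 1);
	assert(atomic_load(&c.flags) == 0);

	model_use_cookie(&c);	/* bit already clear: no trace */
	assert(c.cancel_traces == 1);
}
```

A plain clear would also stop the discard, but it cannot tell the caller whether the race was actually hit, which is what makes the trace point useful when debugging.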
On Thu, Nov 17, 2022 at 8:52 AM David Howells <dhowells@redhat.com> wrote:
>
> Dave Wysochanski <dwysocha@redhat.com> wrote:
>
> > + clear_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags);
>
> Actually, can you do test_and_clear_bit() and then log a trace point, say:
>
> fscache_see_cookie(cookie, fscache_cookie_see_lru_discard_cancel);
>
> if the bit was set.
>
> David
>

Ok sure. I will post a v2 with the trace point and the test_and_clear_bit.
diff --git a/fs/fscache/cookie.c b/fs/fscache/cookie.c
index 451d8a077e12..a90c743fec79 100644
--- a/fs/fscache/cookie.c
+++ b/fs/fscache/cookie.c
@@ -605,6 +605,13 @@ void __fscache_use_cookie(struct fscache_cookie *cookie, bool will_modify)
 			set_bit(FSCACHE_COOKIE_DO_PREP_TO_WRITE, &cookie->flags);
 			queue = true;
 		}
+		/*
+		 * We could race with cookie_lru which may set LRU_DISCARD bit
+		 * but has yet to run the cookie state machine. If this happens
+		 * and another thread tries to use the cookie, clear LRU_DISCARD
+		 * so we don't end up withdrawing the cookie while in use.
+		 */
+		clear_bit(FSCACHE_COOKIE_DO_LRU_DISCARD, &cookie->flags);
 		break;
 
 	case FSCACHE_COOKIE_STATE_FAILED:
If a cookie expires from the LRU and the LRU_DISCARD flag is set,
but the state machine has not run yet, it's possible another thread
can call fscache_use_cookie and begin to use it.

When the cookie_worker finally runs, it will see the LRU_DISCARD flag set,
transition the cookie->state to LRU_DISCARDING, which will then
withdraw the cookie. Once the cookie is withdrawn the object is
removed and the below oops will occur because the object associated
with the cookie is now NULL.

Fix the oops by clearing the LRU_DISCARD bit if another thread
uses the cookie before the cookie_worker runs.

BUG: kernel NULL pointer dereference, address: 0000000000000008
...
CPU: 31 PID: 44773 Comm: kworker/u130:1 Tainted: G E 6.0.0-5.dneg.x86_64 #1
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 08/26/2022
Workqueue: events_unbound netfs_rreq_write_to_cache_work [netfs]
RIP: 0010:cachefiles_prepare_write+0x28/0x90 [cachefiles]
...
Call Trace:
netfs_rreq_write_to_cache_work+0x11c/0x320 [netfs]
process_one_work+0x217/0x3e0
worker_thread+0x4a/0x3b0
? process_one_work+0x3e0/0x3e0
kthread+0xd6/0x100
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x1f/0x30

Reported-by: Daire Byrne <daire.byrne@gmail.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
---
fs/fscache/cookie.c | 7 +++++++
1 file changed, 7 insertions(+)
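
The interleaving described in the commit message can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: it is single-threaded, the steps run in the problematic order by construction, and every name (`race_cookie`, `race_lru_expire()`, `race_use_cookie()`, `race_cookie_worker()`, `race_replay()`) is hypothetical rather than taken from fscache. The cache object is modelled as a pointer that becomes NULL on withdrawal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states; the real cookie state machine has more of them. */
enum race_state { RACE_ACTIVE, RACE_LRU_DISCARDING };

/* Hypothetical model of a cookie, not struct fscache_cookie. */
struct race_cookie {
	enum race_state state;
	bool do_lru_discard;	/* stands in for the LRU_DISCARD flag */
	int n_active;		/* number of current users */
	int backing;		/* dummy storage for the cache object */
	int *object;		/* NULL once the cookie is withdrawn */
};

static void race_init(struct race_cookie *c)
{
	c->state = RACE_ACTIVE;
	c->do_lru_discard = false;
	c->n_active = 0;
	c->backing = 0;
	c->object = &c->backing;
}

/* Step 1: the cookie expires from the LRU; the worker has not run yet. */
static void race_lru_expire(struct race_cookie *c)
{
	c->do_lru_discard = true;
}

/* Step 2: another thread starts using the cookie before the worker runs. */
static void race_use_cookie(struct race_cookie *c, bool with_fix)
{
	c->n_active++;
	if (with_fix)
		c->do_lru_discard = false;	/* cancel the pending discard */
}

/* Step 3: the worker finally runs the state machine. */
static void race_cookie_worker(struct race_cookie *c)
{
	if (c->do_lru_discard) {
		c->do_lru_discard = false;
		c->state = RACE_LRU_DISCARDING;
		c->object = NULL;		/* withdrawal drops the object */
	}
	/* Otherwise fall through without doing anything. */
}

/* Replay the sequence; true means a user is left holding a NULL object. */
static bool race_replay(bool with_fix)
{
	struct race_cookie c;

	race_init(&c);
	race_lru_expire(&c);
	race_use_cookie(&c, with_fix);
	race_cookie_worker(&c);

	return c.n_active > 0 && c.object == NULL;
}
```

In this model `race_replay(false)` leaves an active user with a NULL object, which corresponds to the condition behind the NULL pointer dereference in the oops, while `race_replay(true)` leaves the object in place because the worker finds nothing to do.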