Message ID | 20231109062056.3181775-7-viro@zeniv.linux.org.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/22] struct dentry: get rid of randomize_layout idiocy | expand |
On Thu, Nov 09, 2023 at 06:20:41AM +0000, Al Viro wrote: > ... we won't see DCACHE_MAY_FREE on anything that is *not* dead > and checking d_flags is just as cheap as checking refcount. > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > --- Could also be a WARN_ON_ONCE() on d_lockref.count > 0 if DCACHE_MAY_FREE is set but probably doesn't matter, Reviewed-by: Christian Brauner <brauner@kernel.org>
On Thu, Nov 09, 2023 at 02:53:04PM +0100, Christian Brauner wrote: > On Thu, Nov 09, 2023 at 06:20:41AM +0000, Al Viro wrote: > > ... we won't see DCACHE_MAY_FREE on anything that is *not* dead > > and checking d_flags is just as cheap as checking refcount. > > > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > > --- > > Could also be a WARN_ON_ONCE() on d_lockref.count > 0 if DCACHE_MAY_FREE > is set but probably doesn't matter, >= 0, actually, but... TBH, in longer run I would rather represent the empty husk state (instance just waiting for shrink_dentry_list() to remove it from its list and free the sucker) not by a bit in ->d_flags, but by a specific negative value in ->d_lockref.count. After this series we have the following picture: all real instances come from __alloc_dentry(). Possible states after that Busy <-> Retained -> Dying -> Freeing | ^ V | Husk ----/ Busy and Retained are live dentries, with positive and zero refcount resp.; that's the pure refcounting land. Eventually we get to succesful lock_for_kill(), which leads to call of __dentry_kill(). That's where the state becomes Dying. On the way out of __dentry_kill() (after ->d_prune()/->d_iput()/->d_release()) we either switch to Freeing (only RCU references remain, actual memory object freed by the end of it) or Husk (the only non-RCU reference is that of a shrink list it's on). Husk, in turn, switches to Freeing as soon as shrink_dentry_list() gets around to it and takes it out of its shrink list. If shrink_dentry_list() picks an instance in Dying state, it quietly removes it from the shrink list and leaves it for __dentry_kill() to deal with. All transitions are under ->d_lock. ->d_lockref.count for those is positive in Busy, zero in Retained and -128 in Dying, Husk and Freeing. Husk is distinguished by having DCACHE_MAY_FREE set. Freeing has no visible difference from Dying. All refcount changes are under ->d_lock. None of them should _ever_ change the negative values. If the last part is easy to verify (right now it's up to "no refcount overflows, all callers of dget_dlock() are guaranteed to be dealing with Busy or Retained instances"), it might make sense to use 3 specific negative values for Dying/Husk/Freeing. What's more, it might make sense to deal with overflows by adding a separate unsigned long __d_large_count_dont_you_ever_touch_that_directly; and have the overflow switch to the 4th special negative number indicating that real counter sits in there. I'm not 100% convinced that this is the right way to handle that mess, but it's an approach I'm at least keeping in mind. Anyway, we need to get the damn thing documented and understandable before dealing with overflows becomes even remotely possible. As it is, it's way too subtle and reasoning about correctness is too convoluted and brittle. PS: "documented" includes explicit description of states, their representations and transitions between them, as well as the objects associated with the instance in each of those, what references are allowed in each state, etc. And the things like in-lookup, cursor, etc. - live dentries have sub-states as well...
diff --git a/fs/dcache.c b/fs/dcache.c index 1476f2d6e9ea..5371f32eb4bb 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1186,11 +1186,10 @@ void shrink_dentry_list(struct list_head *list) spin_lock(&dentry->d_lock); rcu_read_lock(); if (!shrink_lock_dentry(dentry)) { - bool can_free = false; + bool can_free; rcu_read_unlock(); d_shrink_del(dentry); - if (dentry->d_lockref.count < 0) - can_free = dentry->d_flags & DCACHE_MAY_FREE; + can_free = dentry->d_flags & DCACHE_MAY_FREE; spin_unlock(&dentry->d_lock); if (can_free) dentry_free(dentry);
... we won't see DCACHE_MAY_FREE on anything that is *not* dead and checking d_flags is just as cheap as checking refcount. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> --- fs/dcache.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)