Message ID | 20231002023344.GI3389589@ZenIV (mailing list archive) |
---|---|
State | New, archived |
Series | [01/15] rcu pathwalk: prevent bogus hard errors from may_lookup() |
On 10/1/23 9:33 PM, Al Viro wrote:
> in RCU mode we might race with gfs2_evict_inode(), which zeroes
> ->i_gl. Freeing of the object it points to is RCU-delayed, so
> if we manage to fetch the pointer before it's been replaced with
> NULL, we are fine. Check if we'd fetched NULL and treat that
> as "bail out and tell the caller to get out of RCU mode".
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  fs/gfs2/inode.c | 6 ++++--
>  fs/gfs2/super.c | 2 +-
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> index 0eac04507904..e2432c327599 100644
> --- a/fs/gfs2/inode.c
> +++ b/fs/gfs2/inode.c
> @@ -1868,14 +1868,16 @@ int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode,
>  {
>  	struct gfs2_inode *ip;
>  	struct gfs2_holder i_gh;
> +	struct gfs2_glock *gl;
>  	int error;
>
>  	gfs2_holder_mark_uninitialized(&i_gh);
>  	ip = GFS2_I(inode);
> -	if (gfs2_glock_is_locked_by_me(ip->i_gl) == NULL) {
> +	gl = rcu_dereference(ip->i_gl);
> +	if (!gl || gfs2_glock_is_locked_by_me(gl) == NULL) {

This looks wrong. It should be if (gl && ... otherwise the
gfs2_glock_nq_init will dereference the null pointer.

Bob Peterson

On Mon, Oct 02, 2023 at 06:46:03AM -0500, Bob Peterson wrote:
> > diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> > index 0eac04507904..e2432c327599 100644
> > --- a/fs/gfs2/inode.c
> > +++ b/fs/gfs2/inode.c
> > @@ -1868,14 +1868,16 @@ int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode,
> >  {
> >  	struct gfs2_inode *ip;
> >  	struct gfs2_holder i_gh;
> > +	struct gfs2_glock *gl;
> >  	int error;
> >  	gfs2_holder_mark_uninitialized(&i_gh);
> >  	ip = GFS2_I(inode);
> > -	if (gfs2_glock_is_locked_by_me(ip->i_gl) == NULL) {
> > +	gl = rcu_dereference(ip->i_gl);
> > +	if (!gl || gfs2_glock_is_locked_by_me(gl) == NULL) {
>
> This looks wrong. It should be if (gl && ... otherwise the
> gfs2_glock_nq_init will dereference the null pointer.

We shouldn't observe NULL ->i_gl unless we are in RCU mode,
which means we'll bail out without reaching gfs2_glock_nq_init()...

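The control-flow argument in this reply can be checked with a small userspace model. This is only a sketch under stated assumptions: `struct glock`, `locked_by_me()`, `nq_init()` and `permission_model()` are hypothetical stand-ins written for illustration, not the GFS2 functions, and the `MAY_NOT_BLOCK` value is illustrative. It models just the branch structure of the patched condition, where a NULL pointer seen in RCU mode returns -ECHILD before the blocking path is reached.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of this is the real GFS2 code. */
#define MAY_NOT_BLOCK 0x80	/* illustrative value, assumed for this sketch */

struct glock { int held_by_me; };

static int nq_init_calls;	/* counts entries into the blocking path */

/* Stand-in for gfs2_glock_is_locked_by_me(): non-NULL means "already held". */
static const void *locked_by_me(const struct glock *gl)
{
	return gl->held_by_me ? gl : NULL;
}

/* Stand-in for gfs2_glock_nq_init(): the real one would dereference gl. */
static int nq_init(struct glock *gl)
{
	assert(gl != NULL);
	nq_init_calls++;
	gl->held_by_me = 1;
	return 0;
}

/* Models only the branch structure of the patched gfs2_permission(). */
static int permission_model(struct glock *gl, int mask)
{
	if (!gl || locked_by_me(gl) == NULL) {
		if (mask & MAY_NOT_BLOCK)
			return -ECHILD;	/* RCU mode: caller leaves RCU mode and retries */
		return nq_init(gl);	/* a NULL gl gets here only if the invariant breaks */
	}
	return 0;
}
```

In this model, `permission_model(NULL, MAY_NOT_BLOCK)` never reaches `nq_init()`. A NULL pointer with a blocking mask would trip the assertion, which is the case the reply argues cannot happen outside RCU mode.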
On Mon, Oct 02, 2023 at 01:59:46PM +0100, Al Viro wrote:
> On Mon, Oct 02, 2023 at 06:46:03AM -0500, Bob Peterson wrote:
> > > diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> > > index 0eac04507904..e2432c327599 100644
> > > --- a/fs/gfs2/inode.c
> > > +++ b/fs/gfs2/inode.c
> > > @@ -1868,14 +1868,16 @@ int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode,
> > >  {
> > >  	struct gfs2_inode *ip;
> > >  	struct gfs2_holder i_gh;
> > > +	struct gfs2_glock *gl;
> > >  	int error;
> > >  	gfs2_holder_mark_uninitialized(&i_gh);
> > >  	ip = GFS2_I(inode);
> > > -	if (gfs2_glock_is_locked_by_me(ip->i_gl) == NULL) {
> > > +	gl = rcu_dereference(ip->i_gl);
> > > +	if (!gl || gfs2_glock_is_locked_by_me(gl) == NULL) {
> >
> > This looks wrong. It should be if (gl && ... otherwise the
> > gfs2_glock_nq_init will dereference the null pointer.
>
> We shouldn't observe NULL ->i_gl unless we are in RCU mode,
> which means we'll bail out without reaching gfs2_glock_nq_init()...

Something like
	if (unlikely(!gl)) {
		/* inode is getting torn down, must be RCU mode */
		WARN_ON_ONCE(!(mask & MAY_NOT_BLOCK));
		return -ECHILD;
	}
might be a less confusing way to express that...

On Mon, Oct 2, 2023 at 7:09 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, Oct 02, 2023 at 01:59:46PM +0100, Al Viro wrote:
> > On Mon, Oct 02, 2023 at 06:46:03AM -0500, Bob Peterson wrote:
> > > > diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
> > > > index 0eac04507904..e2432c327599 100644
> > > > --- a/fs/gfs2/inode.c
> > > > +++ b/fs/gfs2/inode.c
> > > > @@ -1868,14 +1868,16 @@ int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode,
> > > >  {
> > > >  	struct gfs2_inode *ip;
> > > >  	struct gfs2_holder i_gh;
> > > > +	struct gfs2_glock *gl;
> > > >  	int error;
> > > >  	gfs2_holder_mark_uninitialized(&i_gh);
> > > >  	ip = GFS2_I(inode);
> > > > -	if (gfs2_glock_is_locked_by_me(ip->i_gl) == NULL) {
> > > > +	gl = rcu_dereference(ip->i_gl);
> > > > +	if (!gl || gfs2_glock_is_locked_by_me(gl) == NULL) {
> > >
> > > This looks wrong. It should be if (gl && ... otherwise the
> > > gfs2_glock_nq_init will dereference the null pointer.
> >
> > We shouldn't observe NULL ->i_gl unless we are in RCU mode,
> > which means we'll bail out without reaching gfs2_glock_nq_init()...
>
> Something like
>	if (unlikely(!gl)) {
>		/* inode is getting torn down, must be RCU mode */
>		WARN_ON_ONCE(!(mask & MAY_NOT_BLOCK));
>		return -ECHILD;
>	}
> might be less confusing way to express that...

Looking good, thanks. I'll queue it up.

Could you please send such fixes to the filesystem-specific list in
the future (scripts/get_maintainer.pl)?

Thanks,
Andreas

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 0eac04507904..e2432c327599 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -1868,14 +1868,16 @@ int gfs2_permission(struct mnt_idmap *idmap, struct inode *inode,
 {
 	struct gfs2_inode *ip;
 	struct gfs2_holder i_gh;
+	struct gfs2_glock *gl;
 	int error;
 
 	gfs2_holder_mark_uninitialized(&i_gh);
 	ip = GFS2_I(inode);
-	if (gfs2_glock_is_locked_by_me(ip->i_gl) == NULL) {
+	gl = rcu_dereference(ip->i_gl);
+	if (!gl || gfs2_glock_is_locked_by_me(gl) == NULL) {
 		if (mask & MAY_NOT_BLOCK)
 			return -ECHILD;
-		error = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh);
+		error = gfs2_glock_nq_init(gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh);
 		if (error)
 			return error;
 	}
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 02d93da21b2b..0dd5641990b9 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1550,7 +1550,7 @@ static void gfs2_evict_inode(struct inode *inode)
 		wait_on_bit_io(&ip->i_flags, GIF_GLOP_PENDING, TASK_UNINTERRUPTIBLE);
 		gfs2_glock_add_to_lru(ip->i_gl);
 		gfs2_glock_put_eventually(ip->i_gl);
-		ip->i_gl = NULL;
+		rcu_assign_pointer(ip->i_gl, NULL);
 	}
 }

in RCU mode we might race with gfs2_evict_inode(), which zeroes
->i_gl. Freeing of the object it points to is RCU-delayed, so
if we manage to fetch the pointer before it's been replaced with
NULL, we are fine. Check if we'd fetched NULL and treat that
as "bail out and tell the caller to get out of RCU mode".

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/gfs2/inode.c | 6 ++++--
 fs/gfs2/super.c | 2 +-
 2 files changed, 5 insertions(+), 3 deletions(-)
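
The fetch-once pattern described in the commit message can be sketched in userspace with C11 atomics. This is a rough analogue, not kernel code: `struct inode_model`, `evict_model()` and `read_state_model()` are hypothetical names, the release store and acquire load only approximate `rcu_assign_pointer()` and `rcu_dereference()`, and the RCU-delayed freeing that keeps a fetched pointer valid is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the glock and the inode that points to it. */
struct glock { int state; };

struct inode_model {
	_Atomic(struct glock *) i_gl;
};

/* Evict side: rough analogue of rcu_assign_pointer(ip->i_gl, NULL). */
static void evict_model(struct inode_model *ip)
{
	atomic_store_explicit(&ip->i_gl, (struct glock *)NULL,
			      memory_order_release);
}

/*
 * Reader side: fetch the pointer exactly once, then use only the local
 * copy. Re-reading ip->i_gl after the NULL check could observe the
 * eviction and hand NULL to a later user.
 */
static int read_state_model(struct inode_model *ip, int *state)
{
	struct glock *gl;

	assert(ip != NULL && state != NULL);
	gl = atomic_load_explicit(&ip->i_gl, memory_order_acquire);
	if (!gl)
		return -ECHILD;	/* lost the race: caller must leave RCU mode */
	*state = gl->state;
	return 0;
}
```

The point the sketch tries to capture is that the NULL check and every later use go through the same local `gl`, which mirrors why the patch passes `gl` rather than `ip->i_gl` to `gfs2_glock_nq_init()`.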