Message ID | 75f4472c-b9e2-6353-3af0-c4939ecfca41@virtuozzo.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote: > restart_grace() uses hardcoded init_net. > It can cause to "list_add double add" in following scenario: > > 1) nfsd and lockd was started in several net namespaces > 2) nfsd in init_net was stopped (lockd was not stopped because > it have users from another net namespaces) > 3) lockd got signal, called restart_grace() -> set_grace_period() > and enabled lock_manager in hardcoded init_net. > 4) nfsd in init_net is started again, > its lockd_up() calls set_grace_period() and tries to add > lock_manager into init_net 2nd time. > > Jeff Layton suggest: > "Make it safe to call locks_start_grace multiple times on the same > lock_manager. If it's already on the global grace_list, then don't try > to add it again. > > With this change, we also need to ensure that the nfsd4 lock manager > initializes the list before we call locks_start_grace. While we're at > it, move the rest of the nfsd_net initialization into > nfs4_state_create_net. I see no reason to have it spread over two > functions like it is today." > > Suggested patch was updated to generate warning in described situation. > > Suggested-by: Jeff Layton <jlayton@redhat.com> > Signed-off-by: Vasily Averin <vvs@virtuozzo.com> > --- > fs/nfs_common/grace.c | 6 +++++- > fs/nfsd/nfs4state.c | 7 ++++--- > 2 files changed, 9 insertions(+), 4 deletions(-) > > diff --git a/fs/nfs_common/grace.c b/fs/nfs_common/grace.c > index bd3e2d3..5be08f0 100644 > --- a/fs/nfs_common/grace.c > +++ b/fs/nfs_common/grace.c > @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm) > struct list_head *grace_list = net_generic(net, grace_net_id); > > spin_lock(&grace_lock); > - list_add(&lm->list, grace_list); > + if (list_empty(&lm->list)) > + list_add(&lm->list, grace_list); > + else > + WARN(1, "double list_add attempt detected in net %x %s\n", > + net->ns.inum, (net == &init_net) ? "(init_net)" : ""); > spin_unlock(&grace_lock); > } I'm not sure that warning really means much. It's not _really_ a bug to request that a new grace period start while it's already in one. In general, it's ok to request a new grace period while it's currently enforcing one. That should just have the effect of extending the existing grace period. > EXPORT_SYMBOL_GPL(locks_start_grace); > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index 7345143..b29b5a1 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -7103,6 +7103,10 @@ static int nfs4_state_create_net(struct net *net) > INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]); > nn->conf_name_tree = RB_ROOT; > nn->unconf_name_tree = RB_ROOT; > + nn->boot_time = get_seconds(); > + nn->grace_ended = false; > + nn->nfsd4_manager.block_opens = true; > + INIT_LIST_HEAD(&nn->nfsd4_manager.list); > INIT_LIST_HEAD(&nn->client_lru); > INIT_LIST_HEAD(&nn->close_lru); > INIT_LIST_HEAD(&nn->del_recall_lru); > @@ -7160,9 +7164,6 @@ nfs4_state_start_net(struct net *net) > ret = nfs4_state_create_net(net); > if (ret) > return ret; > - nn->boot_time = get_seconds(); > - nn->grace_ended = false; > - nn->nfsd4_manager.block_opens = true; > locks_start_grace(net, &nn->nfsd4_manager); > nfsd4_client_tracking_init(net); > printk(KERN_INFO "NFSD: starting %ld-second grace period (net %x)\n",
On 2017-11-13 14:49, Jeff Layton wrote: > On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote: >> --- a/fs/nfs_common/grace.c >> +++ b/fs/nfs_common/grace.c >> @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm) >> struct list_head *grace_list = net_generic(net, grace_net_id); >> >> spin_lock(&grace_lock); >> - list_add(&lm->list, grace_list); >> + if (list_empty(&lm->list)) >> + list_add(&lm->list, grace_list); >> + else >> + WARN(1, "double list_add attempt detected in net %x %s\n", >> + net->ns.inum, (net == &init_net) ? "(init_net)" : ""); >> spin_unlock(&grace_lock); >> } > > I'm not sure that warning really means much. > > It's not _really_ a bug to request that a new grace period start while > it's already in one. In general, it's ok to request a new grace period > while it's currently enforcing one. That should just have the effect of > extending the existing grace period. "double list_add" can happen in init_net when legacy signal in lockd was used. It should not happen during usual extending of existing grace period, because restart_grace() calls locks_end_grace() before set_grace_period() but it can race with start of lockd_up_net() in init_net. I'm agree: we do not have any bugs in this scenario, all should work correctly. However I would like to keep WARN to properly detect lost locks_end_grace()/ cancel_delayed_work(). If you worry about real false positive and do not worry about abstract future troubles in init_net, I can move WARN under (net != &init_net) check. However I would like to keep this warning here. On the other hand if you disagree and still believe that WARN is not required here I'm ready to agree with your original patch version. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2017-11-13 at 17:57 +0300, Vasily Averin wrote: > On 2017-11-13 14:49, Jeff Layton wrote: > > On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote: > > > --- a/fs/nfs_common/grace.c > > > +++ b/fs/nfs_common/grace.c > > > @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm) > > > struct list_head *grace_list = net_generic(net, grace_net_id); > > > > > > spin_lock(&grace_lock); > > > - list_add(&lm->list, grace_list); > > > + if (list_empty(&lm->list)) > > > + list_add(&lm->list, grace_list); > > > + else > > > + WARN(1, "double list_add attempt detected in net %x %s\n", > > > + net->ns.inum, (net == &init_net) ? "(init_net)" : ""); > > > spin_unlock(&grace_lock); > > > } > > > > I'm not sure that warning really means much. > > > > It's not _really_ a bug to request that a new grace period start while > > it's already in one. In general, it's ok to request a new grace period > > while it's currently enforcing one. That should just have the effect of > > extending the existing grace period. > > "double list_add" can happen in init_net when legacy signal in lockd was used. > It should not happen during usual extending of existing grace period, > because restart_grace() calls locks_end_grace() before set_grace_period() > but it can race with start of lockd_up_net() in init_net. > I'm agree: we do not have any bugs in this scenario, all should work correctly. > > However I would like to keep WARN to properly detect lost locks_end_grace()/ > cancel_delayed_work(). > > If you worry about real false positive and do not worry about abstract > future troubles in init_net, I can move WARN under (net != &init_net) check. > > However I would like to keep this warning here. > > On the other hand if you disagree and still believe that WARN is not required here > I'm ready to agree with your original patch version. Fair enough. I don't feel strongly about it. I just have been doing some investigation lately into clustered grace period management, so it's a little on my mind. [1] For now though, you're certainly correct that we'll never attempt to set the grace period while we're already in it. If we ever want to do more complex grace period handling in the kernel, we may need to drop that WARN, however. [1]: https://jtlayton.wordpress.com/2017/11/07/active-active-nfs-over-cephfs/
On Mon, Nov 13, 2017 at 03:06:17PM -0500, Jeff Layton wrote: > On Mon, 2017-11-13 at 17:57 +0300, Vasily Averin wrote: > > On 2017-11-13 14:49, Jeff Layton wrote: > > > On Mon, 2017-11-13 at 07:25 +0300, Vasily Averin wrote: > > > > --- a/fs/nfs_common/grace.c > > > > +++ b/fs/nfs_common/grace.c > > > > @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm) > > > > struct list_head *grace_list = net_generic(net, grace_net_id); > > > > > > > > spin_lock(&grace_lock); > > > > - list_add(&lm->list, grace_list); > > > > + if (list_empty(&lm->list)) > > > > + list_add(&lm->list, grace_list); > > > > + else > > > > + WARN(1, "double list_add attempt detected in net %x %s\n", > > > > + net->ns.inum, (net == &init_net) ? "(init_net)" : ""); > > > > spin_unlock(&grace_lock); > > > > } > > > > > > I'm not sure that warning really means much. > > > > > > It's not _really_ a bug to request that a new grace period start while > > > it's already in one. In general, it's ok to request a new grace period > > > while it's currently enforcing one. That should just have the effect of > > > extending the existing grace period. > > > > "double list_add" can happen in init_net when legacy signal in lockd was used. > > It should not happen during usual extending of existing grace period, > > because restart_grace() calls locks_end_grace() before set_grace_period() > > but it can race with start of lockd_up_net() in init_net. > > I'm agree: we do not have any bugs in this scenario, all should work correctly. > > > > However I would like to keep WARN to properly detect lost locks_end_grace()/ > > cancel_delayed_work(). > > > > If you worry about real false positive and do not worry about abstract > > future troubles in init_net, I can move WARN under (net != &init_net) check. > > > > However I would like to keep this warning here. > > > > On the other hand if you disagree and still believe that WARN is not required here > > I'm ready to agree with your original patch version. > > Fair enough. I don't feel strongly about it. I just have been doing some > investigation lately into clustered grace period management, so it's a > little on my mind. [1] > > For now though, you're certainly correct that we'll never attempt to set > the grace period while we're already in it. If we ever want to do more > complex grace period handling in the kernel, we may need to drop that > WARN, however. OK, applied with a minor changelog update. Vasily, if you see anything missing from nfsd-next at this point, let me know: git://linux-nfs.org/~bfields/linux.git nfsd-next --b. -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/nfs_common/grace.c b/fs/nfs_common/grace.c index bd3e2d3..5be08f0 100644 --- a/fs/nfs_common/grace.c +++ b/fs/nfs_common/grace.c @@ -30,7 +30,11 @@ locks_start_grace(struct net *net, struct lock_manager *lm) struct list_head *grace_list = net_generic(net, grace_net_id); spin_lock(&grace_lock); - list_add(&lm->list, grace_list); + if (list_empty(&lm->list)) + list_add(&lm->list, grace_list); + else + WARN(1, "double list_add attempt detected in net %x %s\n", + net->ns.inum, (net == &init_net) ? "(init_net)" : ""); spin_unlock(&grace_lock); } EXPORT_SYMBOL_GPL(locks_start_grace); diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 7345143..b29b5a1 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -7103,6 +7103,10 @@ static int nfs4_state_create_net(struct net *net) INIT_LIST_HEAD(&nn->sessionid_hashtbl[i]); nn->conf_name_tree = RB_ROOT; nn->unconf_name_tree = RB_ROOT; + nn->boot_time = get_seconds(); + nn->grace_ended = false; + nn->nfsd4_manager.block_opens = true; + INIT_LIST_HEAD(&nn->nfsd4_manager.list); INIT_LIST_HEAD(&nn->client_lru); INIT_LIST_HEAD(&nn->close_lru); INIT_LIST_HEAD(&nn->del_recall_lru); @@ -7160,9 +7164,6 @@ nfs4_state_start_net(struct net *net) ret = nfs4_state_create_net(net); if (ret) return ret; - nn->boot_time = get_seconds(); - nn->grace_ended = false; - nn->nfsd4_manager.block_opens = true; locks_start_grace(net, &nn->nfsd4_manager); nfsd4_client_tracking_init(net); printk(KERN_INFO "NFSD: starting %ld-second grace period (net %x)\n",
restart_grace() uses hardcoded init_net. It can cause to "list_add double add" in following scenario: 1) nfsd and lockd was started in several net namespaces 2) nfsd in init_net was stopped (lockd was not stopped because it have users from another net namespaces) 3) lockd got signal, called restart_grace() -> set_grace_period() and enabled lock_manager in hardcoded init_net. 4) nfsd in init_net is started again, its lockd_up() calls set_grace_period() and tries to add lock_manager into init_net 2nd time. Jeff Layton suggest: "Make it safe to call locks_start_grace multiple times on the same lock_manager. If it's already on the global grace_list, then don't try to add it again. With this change, we also need to ensure that the nfsd4 lock manager initializes the list before we call locks_start_grace. While we're at it, move the rest of the nfsd_net initialization into nfs4_state_create_net. I see no reason to have it spread over two functions like it is today." Suggested patch was updated to generate warning in described situation. Suggested-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> --- fs/nfs_common/grace.c | 6 +++++- fs/nfsd/nfs4state.c | 7 ++++--- 2 files changed, 9 insertions(+), 4 deletions(-)