Message ID | 1539543498-29105-17-git-send-email-jsimmons@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | lustre: more assorted fixes for lustre 2.10 | expand |
On Sun, Oct 14 2018, James Simmons wrote: > From: Lai Siyao <lai.siyao@whamcloud.com> > > A barrier is missing before wake_up() in ll_statahead_interpret(), > which may cause 'ls' hang. Under the right conditions a basic 'ls' > can fail. The debug logs show: > > statahead.c:683:ll_statahead_interpret()) sa_entry software rc -13 > statahead.c:1666:ll_statahead()) revalidate statahead software: -11. > > Obviously statahead failure didn't notify 'ls' process in time. > The mi_cbdata can be stale so add a barrier before calling > wake_up(). > > Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com> > Signed-off-by: Bob Glossman <bob.glossman@intel.com> > WC-bug-id: https://jira.whamcloud.com/browse/LU-9210 > Reviewed-on: https://review.whamcloud.com/27330 > Reviewed-by: Nathaniel Clark <nclark@whamcloud.com> > Reviewed-by: Oleg Drokin <green@whamcloud.com> > Signed-off-by: James Simmons <jsimmons@infradead.org> > --- > drivers/staging/lustre/lustre/llite/statahead.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c > index 1ad308c..0174a4c 100644 > --- a/drivers/staging/lustre/lustre/llite/statahead.c > +++ b/drivers/staging/lustre/lustre/llite/statahead.c > @@ -680,8 +680,14 @@ static int ll_statahead_interpret(struct ptlrpc_request *req, > > spin_lock(&lli->lli_sa_lock); > if (rc) { > - if (__sa_make_ready(sai, entry, rc)) > + if (__sa_make_ready(sai, entry, rc)) { > + /* LU-9210 : Under the right conditions even 'ls' > + * can cause the statahead to fail. Using a memory > + * barrier resolves this issue. > + */ > + smp_mb(); > wake_up(&sai->sai_waitq); > + } > } else { > int first = 0; > entry->se_minfo = minfo; > -- > 1.8.3.1 Again, this is a fairly lame comment to justify the smp_mb(). It appears to me that the issue is most likely the value of entry->se_state. __sa_make_ready() sets this and revalidate_statahead_dentry tests it after waiting on sai_waitq. So I think it would be best if we changed __sa_make_ready() to smp_store_release(&entry->se_state, ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC) and in ll_statahead_interpret() have if (smp_load_acquire(&entry->se_state) == SA_ENTRY_SUCC && entry->se_inode) { This would make it obvious which variable was important, and would show the paired synchronization points. NeilBrown
> On Sun, Oct 14 2018, James Simmons wrote: > > > From: Lai Siyao <lai.siyao@whamcloud.com> > > > > A barrier is missing before wake_up() in ll_statahead_interpret(), > > which may cause 'ls' hang. Under the right conditions a basic 'ls' > > can fail. The debug logs show: > > > > statahead.c:683:ll_statahead_interpret()) sa_entry software rc -13 > > statahead.c:1666:ll_statahead()) revalidate statahead software: -11. > > > > Obviously statahead failure didn't notify 'ls' process in time. > > The mi_cbdata can be stale so add a barrier before calling > > wake_up(). > > > > Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com> > > Signed-off-by: Bob Glossman <bob.glossman@intel.com> > > WC-bug-id: https://jira.whamcloud.com/browse/LU-9210 > > Reviewed-on: https://review.whamcloud.com/27330 > > Reviewed-by: Nathaniel Clark <nclark@whamcloud.com> > > Reviewed-by: Oleg Drokin <green@whamcloud.com> > > Signed-off-by: James Simmons <jsimmons@infradead.org> > > --- > > drivers/staging/lustre/lustre/llite/statahead.c | 8 +++++++- > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c > > index 1ad308c..0174a4c 100644 > > --- a/drivers/staging/lustre/lustre/llite/statahead.c > > +++ b/drivers/staging/lustre/lustre/llite/statahead.c > > @@ -680,8 +680,14 @@ static int ll_statahead_interpret(struct ptlrpc_request *req, > > > > spin_lock(&lli->lli_sa_lock); > > if (rc) { > > - if (__sa_make_ready(sai, entry, rc)) > > + if (__sa_make_ready(sai, entry, rc)) { > > + /* LU-9210 : Under the right conditions even 'ls' > > + * can cause the statahead to fail. Using a memory > > + * barrier resolves this issue. > > + */ > > + smp_mb(); > > wake_up(&sai->sai_waitq); > > + } > > } else { > > int first = 0; > > entry->se_minfo = minfo; > > -- > > 1.8.3.1 > > Again, this is a fairly lame comment to justify the smp_mb(). > It appears to me that the issue is most likely the value of > entry->se_state. > __sa_make_ready() sets this and revalidate_statahead_dentry tests it > after waiting on sai_waitq. > So I think it would be best if we changed __sa_make_ready() to > > smp_store_release(&entry->se_state, ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC) > > and in ll_statahead_interpret() have > > if (smp_load_acquire(&entry->se_state) == SA_ENTRY_SUCC && > entry->se_inode) { > > This would make it obvious which variable was important, and would show > the paired synchronization points. If you think this is lame you should be the JIRA ticket and the original patch. It had zero info so I attempted to extract what I could out of the ticket. Hopefully Lai can fill in the details. I have no problems fixing this another way. I don't see a way in the ticket to easily reproduce this problem to see the new approach would fix it :-(
On Sun, Oct 21 2018, James Simmons wrote: >> On Sun, Oct 14 2018, James Simmons wrote: >> >> > From: Lai Siyao <lai.siyao@whamcloud.com> >> > >> > A barrier is missing before wake_up() in ll_statahead_interpret(), >> > which may cause 'ls' hang. Under the right conditions a basic 'ls' >> > can fail. The debug logs show: >> > >> > statahead.c:683:ll_statahead_interpret()) sa_entry software rc -13 >> > statahead.c:1666:ll_statahead()) revalidate statahead software: -11. >> > >> > Obviously statahead failure didn't notify 'ls' process in time. >> > The mi_cbdata can be stale so add a barrier before calling >> > wake_up(). >> > >> > Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com> >> > Signed-off-by: Bob Glossman <bob.glossman@intel.com> >> > WC-bug-id: https://jira.whamcloud.com/browse/LU-9210 >> > Reviewed-on: https://review.whamcloud.com/27330 >> > Reviewed-by: Nathaniel Clark <nclark@whamcloud.com> >> > Reviewed-by: Oleg Drokin <green@whamcloud.com> >> > Signed-off-by: James Simmons <jsimmons@infradead.org> >> > --- >> > drivers/staging/lustre/lustre/llite/statahead.c | 8 +++++++- >> > 1 file changed, 7 insertions(+), 1 deletion(-) >> > >> > diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c >> > index 1ad308c..0174a4c 100644 >> > --- a/drivers/staging/lustre/lustre/llite/statahead.c >> > +++ b/drivers/staging/lustre/lustre/llite/statahead.c >> > @@ -680,8 +680,14 @@ static int ll_statahead_interpret(struct ptlrpc_request *req, >> > >> > spin_lock(&lli->lli_sa_lock); >> > if (rc) { >> > - if (__sa_make_ready(sai, entry, rc)) >> > + if (__sa_make_ready(sai, entry, rc)) { >> > + /* LU-9210 : Under the right conditions even 'ls' >> > + * can cause the statahead to fail. Using a memory >> > + * barrier resolves this issue. >> > + */ >> > + smp_mb(); >> > wake_up(&sai->sai_waitq); >> > + } >> > } else { >> > int first = 0; >> > entry->se_minfo = minfo; >> > -- >> > 1.8.3.1 >> >> Again, this is a fairly lame comment to justify the smp_mb(). >> It appears to me that the issue is most likely the value of >> entry->se_state. >> __sa_make_ready() sets this and revalidate_statahead_dentry tests it >> after waiting on sai_waitq. >> So I think it would be best if we changed __sa_make_ready() to >> >> smp_store_release(&entry->se_state, ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC) >> >> and in ll_statahead_interpret() have >> >> if (smp_load_acquire(&entry->se_state) == SA_ENTRY_SUCC && >> entry->se_inode) { >> >> This would make it obvious which variable was important, and would show >> the paired synchronization points. > > If you think this is lame you should be the JIRA ticket and the original > patch. It had zero info so I attempted to extract what I could out of the > ticket. Hopefully Lai can fill in the details. I have no problems fixing > this another way. I don't see a way in the ticket to easily reproduce this > problem to see the new approach would fix it :-( I have imposed the version that I think is correct. See below. Thanks, NeilBrown --- a/drivers/staging/lustre/lustre/llite/statahead.c +++ b/drivers/staging/lustre/lustre/llite/statahead.c @@ -322,7 +322,11 @@ __sa_make_ready(struct ll_statahead_info *sai, struct sa_entry *entry, int ret) } } list_add(&entry->se_list, pos); - entry->se_state = ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC; + /* + * LU-9210: ll_statahead_interpet must be able to see this before + * we wake it up + */ + smp_store_release(&entry->se_state, ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC); return (index == sai->sai_index_wait); } @@ -1390,7 +1394,12 @@ static int revalidate_statahead_dentry(struct inode *dir, } } - if (entry->se_state == SA_ENTRY_SUCC && entry->se_inode) { + /* + * We need to see the value that was set immediately before we + * were woken up. + */ + if (smp_load_acquire(&entry->se_state) == SA_ENTRY_SUCC && + entry->se_inode) { struct inode *inode = entry->se_inode; struct lookup_intent it = { .it_op = IT_GETATTR, .it_lock_handle = entry->se_handle };
> On Sun, Oct 21 2018, James Simmons wrote: > > >> On Sun, Oct 14 2018, James Simmons wrote: > >> > >> > From: Lai Siyao <lai.siyao@whamcloud.com> > >> > > >> > A barrier is missing before wake_up() in ll_statahead_interpret(), > >> > which may cause 'ls' hang. Under the right conditions a basic 'ls' > >> > can fail. The debug logs show: > >> > > >> > statahead.c:683:ll_statahead_interpret()) sa_entry software rc -13 > >> > statahead.c:1666:ll_statahead()) revalidate statahead software: -11. > >> > > >> > Obviously statahead failure didn't notify 'ls' process in time. > >> > The mi_cbdata can be stale so add a barrier before calling > >> > wake_up(). > >> > > >> > Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com> > >> > Signed-off-by: Bob Glossman <bob.glossman@intel.com> > >> > WC-bug-id: https://jira.whamcloud.com/browse/LU-9210 > >> > Reviewed-on: https://review.whamcloud.com/27330 > >> > Reviewed-by: Nathaniel Clark <nclark@whamcloud.com> > >> > Reviewed-by: Oleg Drokin <green@whamcloud.com> > >> > Signed-off-by: James Simmons <jsimmons@infradead.org> > >> > --- > >> > drivers/staging/lustre/lustre/llite/statahead.c | 8 +++++++- > >> > 1 file changed, 7 insertions(+), 1 deletion(-) > >> > > >> > diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c > >> > index 1ad308c..0174a4c 100644 > >> > --- a/drivers/staging/lustre/lustre/llite/statahead.c > >> > +++ b/drivers/staging/lustre/lustre/llite/statahead.c > >> > @@ -680,8 +680,14 @@ static int ll_statahead_interpret(struct ptlrpc_request *req, > >> > > >> > spin_lock(&lli->lli_sa_lock); > >> > if (rc) { > >> > - if (__sa_make_ready(sai, entry, rc)) > >> > + if (__sa_make_ready(sai, entry, rc)) { > >> > + /* LU-9210 : Under the right conditions even 'ls' > >> > + * can cause the statahead to fail. Using a memory > >> > + * barrier resolves this issue. > >> > + */ > >> > + smp_mb(); > >> > wake_up(&sai->sai_waitq); > >> > + } > >> > } else { > >> > int first = 0; > >> > entry->se_minfo = minfo; > >> > -- > >> > 1.8.3.1 > >> > >> Again, this is a fairly lame comment to justify the smp_mb(). > >> It appears to me that the issue is most likely the value of > >> entry->se_state. > >> __sa_make_ready() sets this and revalidate_statahead_dentry tests it > >> after waiting on sai_waitq. > >> So I think it would be best if we changed __sa_make_ready() to > >> > >> smp_store_release(&entry->se_state, ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC) > >> > >> and in ll_statahead_interpret() have > >> > >> if (smp_load_acquire(&entry->se_state) == SA_ENTRY_SUCC && > >> entry->se_inode) { > >> > >> This would make it obvious which variable was important, and would show > >> the paired synchronization points. > > > > If you think this is lame you should be the JIRA ticket and the original > > patch. It had zero info so I attempted to extract what I could out of the > > ticket. Hopefully Lai can fill in the details. I have no problems fixing > > this another way. I don't see a way in the ticket to easily reproduce this > > problem to see the new approach would fix it :-( > > I have imposed the version that I think is correct. See below. I opened a ticket - https://jira.whamcloud.com/browse/LU-11616 and pushed this to the OpenSFS branch for full testing. The patch is at: https://review.whamcloud.com/#/c/33571 BTW I did not know about (total store order) TSO platforms and how some architectures don't support this property. An audit of the smp barriers might be a good idea for Lustre. For people not aware of TSO a good article on this is at: https://lwn.net/Articles/576486 > Thanks, > NeilBrown > > --- a/drivers/staging/lustre/lustre/llite/statahead.c > +++ b/drivers/staging/lustre/lustre/llite/statahead.c > @@ -322,7 +322,11 @@ __sa_make_ready(struct ll_statahead_info *sai, struct sa_entry *entry, int ret) > } > } > list_add(&entry->se_list, pos); > - entry->se_state = ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC; > + /* > + * LU-9210: ll_statahead_interpet must be able to see this before > + * we wake it up > + */ > + smp_store_release(&entry->se_state, ret < 0 ? SA_ENTRY_INVA : SA_ENTRY_SUCC); > > return (index == sai->sai_index_wait); > } > @@ -1390,7 +1394,12 @@ static int revalidate_statahead_dentry(struct inode *dir, > } > } > > - if (entry->se_state == SA_ENTRY_SUCC && entry->se_inode) { > + /* > + * We need to see the value that was set immediately before we > + * were woken up. > + */ > + if (smp_load_acquire(&entry->se_state) == SA_ENTRY_SUCC && > + entry->se_inode) { > struct inode *inode = entry->se_inode; > struct lookup_intent it = { .it_op = IT_GETATTR, > .it_lock_handle = entry->se_handle }; >
diff --git a/drivers/staging/lustre/lustre/llite/statahead.c b/drivers/staging/lustre/lustre/llite/statahead.c index 1ad308c..0174a4c 100644 --- a/drivers/staging/lustre/lustre/llite/statahead.c +++ b/drivers/staging/lustre/lustre/llite/statahead.c @@ -680,8 +680,14 @@ static int ll_statahead_interpret(struct ptlrpc_request *req, spin_lock(&lli->lli_sa_lock); if (rc) { - if (__sa_make_ready(sai, entry, rc)) + if (__sa_make_ready(sai, entry, rc)) { + /* LU-9210 : Under the right conditions even 'ls' + * can cause the statahead to fail. Using a memory + * barrier resolves this issue. + */ + smp_mb(); wake_up(&sai->sai_waitq); + } } else { int first = 0; entry->se_minfo = minfo;