Message ID | 5679AF1E.3020304@oracle.com (mailing list archive)
---|---
State | New, archived
Reviewed-by: Mark Fasheh <mfasheh@suse.de>

On Tue, Dec 22, 2015 at 12:14:22PM -0800, Tariq Saeed wrote:
> Hi,
> Looks like this fell through the cracks. This is a very real bug
> encountered by Luminex Software and they tested the fix.
> Regards
> -Tariq
>
>
> -------- Forwarded Message --------
> Subject: [Ocfs2-devel] [PATCH] NFS hangs in __ocfs2_cluster_lock due to race
> with ocfs2_unblock_lock
> Date: Tue, 25 Aug 2015 13:55:30 -0700
> From: Tariq Saeed <tariq.x.saeed@oracle.com>
> To: ocfs2-devel@oss.oracle.com
> CC: mfasheh@suse.de
>
> Orabug: 20933419
>
> NFS on a 2-node ocfs2 cluster, each node exporting a dir. The lock causing
> the hang is the global bitmap inode lock. Node 1 is the master and has
> the lock granted in PR mode; node 2 is on the converting list (PR ->
> EX). There are no holders of the lock on the master node, so it should
> downconvert to NL and grant EX to node 2, but that does not happen.
> BLOCKED and QUEUED are set in the lock resource, which sits on the osb
> blocked list. Threads are waiting in __ocfs2_cluster_lock on BLOCKED.
> One thread wants EX, the rest want PR. So it is as though the downconvert
> thread needs to be kicked to complete the conversion.
>
> The hang is caused by an EX request coming into __ocfs2_cluster_lock on
> the heels of a PR request, right after the PR request sets BUSY (dropping
> l_lock, releasing the EX thread), forcing the incoming EX to wait on BUSY
> without doing anything. The PR thread has called ocfs2_dlm_lock, which
> converts the node 1 lock from NL to PR and queues the ast.
>
> At this time, the upconvert (PR -> EX) request arrives from node 2 and
> finds a conflict with node 1's lock in PR, so the lock resource is put on
> the dlm thread's dirty list.
>
> After returning from ocfs2_dlm_lock, the PR thread now waits behind EX on
> BUSY until awoken by the ast.
>
> Now it is the dlm thread that serially runs dlm_shuffle_lists, ast, bast,
> in that order. dlm_shuffle_lists queues a bast on behalf of node 2
> (which will be run by dlm_thread right after the ast). The ast does its
> part, sets UPCONVERT_FINISHING, clears BUSY, and wakes its waiters. Next,
> dlm_thread runs the bast. It sets BLOCKED and kicks the downconvert (dc)
> thread. The dc thread runs ocfs2_unblock_lock, but since
> UPCONVERT_FINISHING is set, it skips doing anything and requeues.
>
> Inside __ocfs2_cluster_lock, since EX has been waiting on BUSY ahead
> of PR, it wakes up first, finds BLOCKED set, and skips doing anything
> but clearing UPCONVERT_FINISHING (which was actually "meant" for the
> PR thread), and this time waits on BLOCKED. Next, the PR thread comes
> out of its wait, but since UPCONVERT_FINISHING is not set, it skips
> updating l_ro_holders and goes straight to waiting on BLOCKED. So there
> we have a hang! Threads in __ocfs2_cluster_lock wait on BLOCKED, the
> lock resource sits on the osb blocked list, and nothing will run
> ocfs2_unblock_lock until the dc thread is awoken again.
>
> One way to fix this is to wake the dc thread right after clearing
> UPCONVERT_FINISHING, if the BLOCKED flag is set.
>
> Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> ---
>  fs/ocfs2/dlmglue.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
> index 8b23aa2..313c816 100644
> --- a/fs/ocfs2/dlmglue.c
> +++ b/fs/ocfs2/dlmglue.c
> @@ -1390,6 +1390,7 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
>  	unsigned int gen;
>  	int noqueue_attempted = 0;
>  	int dlm_locked = 0;
> +	int kick_dc = 0;
>
>  	if (!(lockres->l_flags & OCFS2_LOCK_INITIALIZED)) {
>  		mlog_errno(-EINVAL);
> @@ -1524,7 +1525,12 @@ update_holders:
>  unlock:
>  	lockres_clear_flags(lockres, OCFS2_LOCK_UPCONVERT_FINISHING);
>
> +	/* ocfs2_unblock_lock requeues on seeing OCFS2_LOCK_UPCONVERT_FINISHING */
> +	kick_dc = (lockres->l_flags & OCFS2_LOCK_BLOCKED);
> +
>  	spin_unlock_irqrestore(&lockres->l_lock, flags);
> +	if (kick_dc)
> +		ocfs2_wake_downconvert_thread(osb);
>
>  out:
>  	/*
>  	 * This is helping work around a lock inversion between the page lock
> --
> 1.7.1
>
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel

--
Mark Fasheh
On Tue, 22 Dec 2015 12:14:22 -0800 Tariq Saeed <tariq.x.saeed@oracle.com> wrote:

> Hi,
> Looks like this fell through the cracks. This is a very real bug
> encountered by Luminex Software and they tested the fix.
> Regards
> -Tariq
>
>
> -------- Forwarded Message --------
> Subject: [Ocfs2-devel] [PATCH] NFS hangs in __ocfs2_cluster_lock due to
> race with ocfs2_unblock_lock

I added a cc:stable to this so it gets backported.
diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 8b23aa2..313c816 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -1390,6 +1390,7 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
 	unsigned int gen;
 	int noqueue_attempted = 0;
 	int dlm_locked = 0;
+	int kick_dc = 0;
 
 	if (!(lockres->l_flags & OCFS2_LOCK_INITIALIZED)) {
 		mlog_errno(-EINVAL);
@@ -1524,7 +1525,12 @@ update_holders:
 unlock:
 	lockres_clear_flags(lockres, OCFS2_LOCK_UPCONVERT_FINISHING);
 
+	/* ocfs2_unblock_lock requeues on seeing OCFS2_LOCK_UPCONVERT_FINISHING */
+	kick_dc = (lockres->l_flags & OCFS2_LOCK_BLOCKED);
+
 	spin_unlock_irqrestore(&lockres->l_lock, flags);
+	if (kick_dc)
+		ocfs2_wake_downconvert_thread(osb);
 
 out:
 	/*
 	 * This is helping work around a lock inversion between the page lock