Message ID | 20220121071205.100648-3-joseph.qi@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Series | ocfs2: fix a deadlock case |
Hi,

This deadlock was originally reported by saeed.mirzamohammadi@oracle.com.
Could you please add Saeed as the reporter (Reported-by)?

Thanks,
Gautham.

-----Original Message-----
From: Joseph Qi <joseph.qi@linux.alibaba.com>
Sent: Friday, January 21, 2022 12:42 PM
To: akpm@linux-foundation.org; tytso@mit.edu; adilger.kernel@dilger.ca
Cc: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; ocfs2-devel@oss.oracle.com; linux-ext4@vger.kernel.org
Subject: [PATCH 2/2] ocfs2: fix a deadlock when commit trans

Commit 6f1b228529ae introduces a regression which can deadlock as
follows:

Task1:                                  Task2:
jbd2_journal_commit_transaction         ocfs2_test_bg_bit_allocatable
spin_lock(&jh->b_state_lock)            jbd_lock_bh_journal_head
__jbd2_journal_remove_checkpoint        spin_lock(&jh->b_state_lock)
jbd2_journal_put_journal_head
jbd_lock_bh_journal_head

Task1 and Task2 lock bh->b_state and jh->b_state_lock in different
order, which finally results in a deadlock.

So use jbd2_journal_[grab|put]_journal_head instead in
ocfs2_test_bg_bit_allocatable() to fix it.

Reported-by: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>
Fixes: 6f1b228529ae ("ocfs2: fix race between searching chunks and release journal_head from buffer_head")
Cc: <stable@vger.kernel.org>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
---
 fs/ocfs2/suballoc.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c
index 481017e1dac5..166c8918c825 100644
--- a/fs/ocfs2/suballoc.c
+++ b/fs/ocfs2/suballoc.c
@@ -1251,26 +1251,23 @@ static int ocfs2_test_bg_bit_allocatable(struct buffer_head *bg_bh,
 {
 	struct ocfs2_group_desc *bg = (struct ocfs2_group_desc *) bg_bh->b_data;
 	struct journal_head *jh;
-	int ret = 1;
+	int ret;
 
 	if (ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap))
 		return 0;
 
-	if (!buffer_jbd(bg_bh))
+	jh = jbd2_journal_grab_journal_head(bg_bh);
+	if (!jh)
 		return 1;
 
-	jbd_lock_bh_journal_head(bg_bh);
-	if (buffer_jbd(bg_bh)) {
-		jh = bh2jh(bg_bh);
-		spin_lock(&jh->b_state_lock);
-		bg = (struct ocfs2_group_desc *) jh->b_committed_data;
-		if (bg)
-			ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
-		else
-			ret = 1;
-		spin_unlock(&jh->b_state_lock);
-	}
-	jbd_unlock_bh_journal_head(bg_bh);
+	spin_lock(&jh->b_state_lock);
+	bg = (struct ocfs2_group_desc *) jh->b_committed_data;
+	if (bg)
+		ret = !ocfs2_test_bit(nr, (unsigned long *)bg->bg_bitmap);
+	else
+		ret = 1;
+	spin_unlock(&jh->b_state_lock);
+	jbd2_journal_put_journal_head(jh);
 
 	return ret;
 }
-- 
2.19.1.6.gb485710b
Sure, will do it in v2.
So could this patch resolve your issue?

Thanks,
Joseph

On 1/23/22 1:31 PM, Gautham Ananthakrishna wrote:
> Hi,
> This deadlock was originally reported by saeed.mirzamohammadi@oracle.com.
> Could you please add Saeed as the reporter (Reported-by)?
>
> Thanks,
> Gautham.
Yes. The patch has resolved the issue.

Thanks,
Gautham.

-----Original Message-----
From: Joseph Qi <joseph.qi@linux.alibaba.com>
Sent: Monday, January 24, 2022 8:08 AM
To: Gautham Ananthakrishna <gautham.ananthakrishna@oracle.com>; akpm@linux-foundation.org; tytso@mit.edu; adilger.kernel@dilger.ca
Cc: ocfs2-devel@oss.oracle.com; linux-ext4@vger.kernel.org; Saeed Mirzamohammadi <saeed.mirzamohammadi@oracle.com>
Subject: Re: [PATCH 2/2] ocfs2: fix a deadlock when commit trans

Sure, will do it in v2.
So could this patch resolve your issue?

Thanks,
Joseph
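For readers who want to see why the grab/put approach from the commit message sidesteps the ordering problem, here is a minimal userspace sketch. It is a model only, not the kernel code: toy_bh, toy_jh, toy_grab(), toy_put() and toy_bit_allocatable() are hypothetical names, and pthread mutexes stand in for the buffer_head bit lock and jh->b_state_lock. The assumption being illustrated is that the grab step takes the buffer-head lock only long enough to pin the journal head with a reference, so the state lock is later taken with no other lock held.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct journal_head. */
struct toy_jh {
	pthread_mutex_t state_lock;     /* models jh->b_state_lock */
	int refcount;                   /* models the journal head reference count */
	const unsigned char *committed; /* models jh->b_committed_data, may be NULL */
};

/* Hypothetical stand-in for struct buffer_head. */
struct toy_bh {
	pthread_mutex_t jh_lock;        /* models the lock taken by jbd_lock_bh_journal_head */
	struct toy_jh *jh;              /* NULL when no journal head is attached */
	const unsigned char *bitmap;    /* models the current group bitmap */
};

static int toy_test_bit(unsigned int nr, const unsigned char *map)
{
	return (map[nr / 8] >> (nr % 8)) & 1;
}

/* Pin the journal head: jh_lock is held only while taking the reference. */
struct toy_jh *toy_grab(struct toy_bh *bh)
{
	struct toy_jh *jh;

	pthread_mutex_lock(&bh->jh_lock);
	jh = bh->jh;
	if (jh)
		jh->refcount++;
	pthread_mutex_unlock(&bh->jh_lock);
	return jh;
}

/* Drop the reference taken by toy_grab(); this sketch never frees the jh. */
void toy_put(struct toy_bh *bh, struct toy_jh *jh)
{
	pthread_mutex_lock(&bh->jh_lock);
	assert(jh->refcount > 0);
	jh->refcount--;
	pthread_mutex_unlock(&bh->jh_lock);
}

/* Same shape as the patched function: the two locks are never nested here. */
int toy_bit_allocatable(struct toy_bh *bh, unsigned int nr)
{
	struct toy_jh *jh;
	int ret;

	if (toy_test_bit(nr, bh->bitmap))
		return 0;

	jh = toy_grab(bh);
	if (!jh)
		return 1;

	pthread_mutex_lock(&jh->state_lock);
	if (jh->committed)
		ret = !toy_test_bit(nr, jh->committed);
	else
		ret = 1;
	pthread_mutex_unlock(&jh->state_lock);
	toy_put(bh, jh);

	return ret;
}
```

In this model the old code would correspond to holding jh_lock across the state_lock section, which is the opposite order from the commit path shown under Task1. With the reference held instead, the journal head stays valid while state_lock is taken on its own.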