Message ID: 1589961079-27932-1-git-send-email-xiubli@redhat.com (mailing list archive)
State: New, archived
Series: ceph: make sure the mdsc->mutex is nested in s->s_mutex to fix dead lock
On Wed, 2020-05-20 at 03:51 -0400, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> The call trace:
>
> <6>[15981.740583] libceph: mon2 (1)10.72.36.245:40083 session lost, hunting for new mon
> <3>[16097.960293] INFO: task kworker/18:1:32111 blocked for more than 122 seconds.
> <3>[16097.960860]       Tainted: G            E     5.7.0-rc5+ #80
> <3>[16097.961332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> <6>[16097.961868] kworker/18:1    D    0 32111      2 0x80004080
> <6>[16097.962151] Workqueue: ceph-msgr ceph_con_workfn [libceph]
> <4>[16097.962188] Call Trace:
> <4>[16097.962353]  ? __schedule+0x276/0x6e0
> <4>[16097.962359]  ? schedule+0x40/0xb0
> <4>[16097.962364]  ? schedule_preempt_disabled+0xa/0x10
> <4>[16097.962368]  ? __mutex_lock.isra.8+0x2b5/0x4a0
> <4>[16097.962460]  ? kick_requests+0x21/0x100 [ceph]
> <4>[16097.962485]  ? ceph_mdsc_handle_mdsmap+0x19c/0x5f0 [ceph]
> <4>[16097.962503]  ? extra_mon_dispatch+0x34/0x40 [ceph]
> <4>[16097.962523]  ? extra_mon_dispatch+0x34/0x40 [ceph]
> <4>[16097.962580]  ? dispatch+0x77/0x930 [libceph]
> <4>[16097.962602]  ? try_read+0x78b/0x11e0 [libceph]
> <4>[16097.962619]  ? __switch_to_asm+0x40/0x70
> <4>[16097.962623]  ? __switch_to_asm+0x34/0x70
> <4>[16097.962627]  ? __switch_to_asm+0x40/0x70
> <4>[16097.962631]  ? __switch_to_asm+0x34/0x70
> <4>[16097.962635]  ? __switch_to_asm+0x40/0x70
> <4>[16097.962654]  ? ceph_con_workfn+0x130/0x5e0 [libceph]
> <4>[16097.962713]  ? process_one_work+0x1ad/0x370
> <4>[16097.962717]  ? worker_thread+0x30/0x390
> <4>[16097.962722]  ? create_worker+0x1a0/0x1a0
> <4>[16097.962737]  ? kthread+0x112/0x130
> <4>[16097.962742]  ? kthread_park+0x80/0x80
> <4>[16097.962747]  ? ret_from_fork+0x35/0x40
> <3>[16097.962758] INFO: task kworker/25:1:1747 blocked for more than 122 seconds.
> <3>[16097.963233]       Tainted: G            E     5.7.0-rc5+ #80
> <3>[16097.963792] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> <6>[16097.964298] kworker/25:1    D    0  1747      2 0x80004080
> <6>[16097.964325] Workqueue: ceph-msgr ceph_con_workfn [libceph]
> <4>[16097.964331] Call Trace:
> <4>[16097.964340]  ? __schedule+0x276/0x6e0
> <4>[16097.964344]  ? schedule+0x40/0xb0
> <4>[16097.964347]  ? schedule_preempt_disabled+0xa/0x10
> <4>[16097.964351]  ? __mutex_lock.isra.8+0x2b5/0x4a0
> <4>[16097.964376]  ? handle_reply+0x33f/0x6f0 [ceph]
> <4>[16097.964407]  ? dispatch+0xa6/0xbc0 [ceph]
> <4>[16097.964429]  ? read_partial_message+0x214/0x770 [libceph]
> <4>[16097.964449]  ? try_read+0x78b/0x11e0 [libceph]
> <4>[16097.964454]  ? __switch_to_asm+0x40/0x70
> <4>[16097.964458]  ? __switch_to_asm+0x34/0x70
> <4>[16097.964461]  ? __switch_to_asm+0x40/0x70
> <4>[16097.964465]  ? __switch_to_asm+0x34/0x70
> <4>[16097.964470]  ? __switch_to_asm+0x40/0x70
> <4>[16097.964489]  ? ceph_con_workfn+0x130/0x5e0 [libceph]
> <4>[16097.964494]  ? process_one_work+0x1ad/0x370
> <4>[16097.964498]  ? worker_thread+0x30/0x390
> <4>[16097.964501]  ? create_worker+0x1a0/0x1a0
> <4>[16097.964506]  ? kthread+0x112/0x130
> <4>[16097.964511]  ? kthread_park+0x80/0x80
> <4>[16097.964516]  ? ret_from_fork+0x35/0x40
>
> URL: https://tracker.ceph.com/issues/45609
> Reported-by: "Yan, Zheng" <zyan@redhat.com>
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/mds_client.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 6c283c5..0e0ab01 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -3769,8 +3769,6 @@ static int encode_snap_realms(struct ceph_mds_client *mdsc,
>   * recovering MDS might have.
>   *
>   * This is a relatively heavyweight operation, but it's rare.
> - *
> - * called with mdsc->mutex held.
>   */
>  static void send_mds_reconnect(struct ceph_mds_client *mdsc,
>  			       struct ceph_mds_session *session)
> @@ -4024,7 +4022,9 @@ static void check_new_map(struct ceph_mds_client *mdsc,
>  		    oldstate != CEPH_MDS_STATE_STARTING)
>  			pr_info("mds%d recovery completed\n", s->s_mds);
>  		kick_requests(mdsc, i);
> +		mutex_unlock(&mdsc->mutex);
>  		mutex_lock(&s->s_mutex);
> +		mutex_lock(&mdsc->mutex);
>  		ceph_kick_flushing_caps(mdsc, s);
>  		mutex_unlock(&s->s_mutex);
>  		wake_up_session_caps(s, RECONNECT);

Good catch. Merged into testing branch. Thanks!
On Wed, May 20, 2020 at 1:44 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Wed, 2020-05-20 at 03:51 -0400, xiubli@redhat.com wrote:
> > From: Xiubo Li <xiubli@redhat.com>
> >
> > [snip call traces and patch]
> >
> > URL: https://tracker.ceph.com/issues/45609
> > Reported-by: "Yan, Zheng" <zyan@redhat.com>
> > Signed-off-by: Xiubo Li <xiubli@redhat.com>
>
> Good catch. Merged into testing branch.

These stack traces take up the entire screen and provide very
little information. Since this is a simple lock ordering issue,
could they be replaced with a short description and perhaps a
reference to mds_client.h where lock dependencies are documented?

Thanks,

Ilya
On 2020/5/27 20:25, Ilya Dryomov wrote:
> On Wed, May 20, 2020 at 1:44 PM Jeff Layton <jlayton@kernel.org> wrote:
>> On Wed, 2020-05-20 at 03:51 -0400, xiubli@redhat.com wrote:
>>> From: Xiubo Li <xiubli@redhat.com>
>>>
>>> [snip call traces and patch]
>>
>> Good catch. Merged into testing branch.
> These stack traces take up the entire screen and provide very
> little information. Since this is a simple lock ordering issue,
> could they be replaced with a short description and perhaps a
> reference to mds_client.h where lock dependencies are documented?

Hmm, it makes sense.

Should I post a V2? Or will Jeff do the fix on the testing branch?

Thanks
BRs
Xiubo

> Thanks,
>
> Ilya
>
On Wed, 2020-05-27 at 20:47 +0800, Xiubo Li wrote:
> On 2020/5/27 20:25, Ilya Dryomov wrote:
> > On Wed, May 20, 2020 at 1:44 PM Jeff Layton <jlayton@kernel.org> wrote:
> > > On Wed, 2020-05-20 at 03:51 -0400, xiubli@redhat.com wrote:
> > > > From: Xiubo Li <xiubli@redhat.com>
> > > >
> > > > [snip call traces and patch]
> > >
> > > Good catch. Merged into testing branch.
> > These stack traces take up the entire screen and provide very
> > little information. Since this is a simple lock ordering issue,
> > could they be replaced with a short description and perhaps a
> > reference to mds_client.h where lock dependencies are documented?
>
> Hmm, it makes sense.
>
> Should I post a V2? Or will Jeff do the fix on the testing branch?
>

I went ahead and just fixed it in the testing branch. Let me know if you
want to revise the verbiage.
On 2020/5/27 21:47, Jeff Layton wrote:
> On Wed, 2020-05-27 at 20:47 +0800, Xiubo Li wrote:
>> On 2020/5/27 20:25, Ilya Dryomov wrote:
>>> [snip]
>>> These stack traces take up the entire screen and provide very
>>> little information. Since this is a simple lock ordering issue,
>>> could they be replaced with a short description and perhaps a
>>> reference to mds_client.h where lock dependencies are documented?
>>
>> Hmm, it makes sense.
>>
>> Should I post a V2? Or will Jeff do the fix on the testing branch?
>>
> I went ahead and just fixed it in the testing branch. Let me know if you
> want to revise the verbiage.
>

That looks good to me. Thanks Jeff.

BRs
Xiubo
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 6c283c5..0e0ab01 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -3769,8 +3769,6 @@ static int encode_snap_realms(struct ceph_mds_client *mdsc,
  * recovering MDS might have.
  *
  * This is a relatively heavyweight operation, but it's rare.
- *
- * called with mdsc->mutex held.
  */
 static void send_mds_reconnect(struct ceph_mds_client *mdsc,
 			       struct ceph_mds_session *session)
@@ -4024,7 +4022,9 @@ static void check_new_map(struct ceph_mds_client *mdsc,
 		    oldstate != CEPH_MDS_STATE_STARTING)
 			pr_info("mds%d recovery completed\n", s->s_mds);
 		kick_requests(mdsc, i);
+		mutex_unlock(&mdsc->mutex);
 		mutex_lock(&s->s_mutex);
+		mutex_lock(&mdsc->mutex);
 		ceph_kick_flushing_caps(mdsc, s);
 		mutex_unlock(&s->s_mutex);
 		wake_up_session_caps(s, RECONNECT);