From patchwork Thu Feb 17 23:31:43 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jim Schutt
X-Patchwork-Id: 572561
Subject: Re: cosd multi-second stalls cause "wrongly marked me down"
From: "Jim Schutt"
To: "Sage Weil"
Cc: "Gregory Farnum", "ceph-devel@vger.kernel.org"
References: <1297891508.25491.120.camel@sale659.sandia.gov>
 <75157CFDA63D45458FC47FB7BA6CB974@gmail.com>
 <1297893011.25491.124.camel@sale659.sandia.gov>
 <1297957574.25491.152.camel@sale659.sandia.gov>
Date: Thu, 17 Feb 2011 16:31:43 -0700
Message-ID: <1297985503.25491.175.camel@sale659.sandia.gov>
X-Mailer: Evolution 2.12.3 (2.12.3-19.el5)
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index 76b8af8..dab6054 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -1530,8 +1530,10 @@ void OSD::tick()
 
   // periodically kick recovery work queue
   recovery_tp.kick();
-  
+
+  dout(20) << "tick getting read lock on map_lock" << dendl;
   map_lock.get_read();
+  dout(20) << "tick got read lock on map_lock" << dendl;
 
   if (scrub_should_schedule()) {
     sched_scrub();
@@ -1544,11 +1546,13 @@ void OSD::tick()
   check_replay_queue();
 
   // mon report?
+  dout(20) << "tick sending mon report" << dendl;
   utime_t now = g_clock.now();
   if (now - last_mon_report > g_conf.osd_mon_report_interval)
     do_mon_report();
 
   // remove stray pgs?
+  dout(20) << "tick removing stray pgs" << dendl;
   remove_list_lock.Lock();
   for (map > >::iterator p = remove_list.begin();
        p != remove_list.end();
@@ -1566,19 +1570,23 @@ void OSD::tick()
 
   map_lock.put_read();
 
+  dout(20) << "tick sending log to logclient" << dendl;
   logclient.send_log();
 
+  dout(20) << "tick arming timer for next tick" << dendl;
   timer.add_event_after(1.0, new C_Tick(this));
 
   // only do waiters if dispatch() isn't currently running.  (if it is,
   // it'll do the waiters, and doing them here may screw up ordering
   // of op_queue vs handle_osd_map.)
+  dout(20) << "tick checking dispatch queue status" << dendl;
   if (!dispatch_running) {
     dispatch_running = true;
     do_waiters();
     dispatch_running = false;
     dispatch_cond.Signal();
   }
+  dout(20) << "tick done" << dendl;
 }
 
 

Check out the result: