From patchwork Wed Aug  7 03:35:17 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Harper
X-Patchwork-Id: 2839763
From: James Harper
To: "ceph-devel@vger.kernel.org"
Subject: RE: bug in /etc/init.d/ceph debian
Date: Wed, 7 Aug 2013 03:35:17 +0000
Message-ID: <6035A0D088A63A46850C3988ED045A4B62E82A96@BITCOM1.int.sbss.com.au>
References: <6035A0D088A63A46850C3988ED045A4B62E7617E@BITCOM1.int.sbss.com.au>
In-Reply-To: <6035A0D088A63A46850C3988ED045A4B62E7617E@BITCOM1.int.sbss.com.au>
Sender: ceph-devel-owner@vger.kernel.org
X-Mailing-List: ceph-devel@vger.kernel.org

>
> I'm running ceph 0.61.7-1~bpo70+1 and I think there is a bug in
> /etc/init.d/ceph
>
> The heartbeat RA expects that the init.d script will return 3 for "not running",
> but if there is no agent (eg mds) defined for that host it will return 0 instead,
> so pacemaker thinks the agent is running on a node where it isn't even
> defined and presumably would then start doing stonith when it finds it
> remains running after a stop command.
>
> Or maybe that is the correct behaviour of the init.d script and the RA needs
> to be modified?
>

Nobody interested in this? My proposed fix follows this email.

Return status is:
0 - everything tested is running
1 - something wrong
3 - something tested is stopped

Without this patch, the resource agents report that the service is running if the service is not defined on the host.

I'm not sure though if this is the right approach. Maybe the /etc/init.d/ceph should return 0 when checking the status of (say) mon, when there are no mons defined on this host?
James

---
--- ceph.orig	2013-08-07 13:28:25.000000000 +1000
+++ ceph	2013-08-07 13:32:37.000000000 +1000
@@ -170,6 +170,9 @@
 get_local_name_list
 get_name_list "$@"
 
+running=0
+dead=0
+stopped=0
 for name in $what; do
     type=`echo $name | cut -c 1-3`   # e.g. 'mon', if $item is 'mon1'
     id=`echo $name | cut -c 4- | sed 's/^\\.//'`
@@ -375,14 +378,15 @@
             if daemon_is_running $name ceph-$type $id $pid_file; then
                 echo -n "$name: running "
                 do_cmd "$BINDIR/ceph --admin-daemon $asok version 2>/dev/null" || echo unknown
+                running=1
             elif [ -e "$pid_file" ]; then
                 # daemon is dead, but pid file still exists
                 echo "$name: dead."
-                EXIT_STATUS=1
+                dead=1
             else
                 # daemon is dead, and pid file is gone
                 echo "$name: not running."
-                EXIT_STATUS=3
+                stopped=1
             fi
             ;;
 
@@ -430,6 +434,16 @@
     esac
 done
 
+if [ "$command" = "status" ]; then
+    if [ "$dead" = "1" ]; then
+        EXIT_STATUS=1
+    elif [ "$running" = "1" ]; then
+        EXIT_STATUS=0
+    else
+        EXIT_STATUS=3
+    fi
+fi
+
 # activate latent osds?
 if [ "$command" = "start" ]; then
     if [ "$*" = "" ] || echo $* | grep -q ^osd\$ ; then
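
For anyone who wants to try the status precedence without a ceph install, here is a minimal standalone sketch of the aggregation in the last hunk. The function name aggregate_status and the state words "running", "dead" and "stopped" are made up for illustration and are not part of /etc/init.d/ceph. Read literally, the patch returns 1 if any daemon is dead, otherwise 0 if at least one is running (even when another is stopped), and 3 when nothing is running or nothing is defined on the host.

```shell
#!/bin/sh
# Hypothetical helper, not part of /etc/init.d/ceph: folds a list of
# per-daemon states into one exit code with the same precedence as the
# patch (dead beats running, running beats stopped/undefined).
aggregate_status() {
    running=0
    dead=0
    for state in "$@"; do
        case "$state" in
            running) running=1 ;;
            dead)    dead=1 ;;
            stopped) ;;   # nothing to record; falls through to 3 below
        esac
    done

    if [ "$dead" = "1" ]; then
        return 1
    elif [ "$running" = "1" ]; then
        return 0
    else
        return 3
    fi
}

# No daemons of the requested type defined on this host.
rc=0; aggregate_status || rc=$?
echo "none defined: $rc"        # prints "none defined: 3"

# One daemon running, one dead with a stale pid file.
rc=0; aggregate_status running dead || rc=$?
echo "running + dead: $rc"      # prints "running + dead: 1"
```

The empty-argument case is the one the original report is about: the loop body never runs, so the result is 3 rather than the 0 the unpatched script returned.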