From patchwork Tue Apr 25 09:34:21 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Dan van der Ster
X-Patchwork-Id: 9697787
References: <319cffed-6ef8-bb3f-186a-77b2be62a2db@suse.com>
From: Dan van der Ster
Date: Tue, 25 Apr 2017 11:34:21 +0200
Subject: Re: v12.0.2 Luminous (dev) released
To: Abhishek Lekshmanan
Cc: "ceph-devel@vger.kernel.org", ceph-users
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: ceph-devel@vger.kernel.org

Could this change be the culprit?
commit 973829132bf7206eff6c2cf30dd0aa32fb0ce706
Author: Sage Weil
Date:   Fri Mar 31 09:33:19 2017 -0400

    mon/OSDMonitor: spinlock -> std::mutex

    I think spinlock is dangerous here: we're doing semi-unbounded
    work (decode).  Also seemingly innocuous code like dout macros
    take mutexes.

    Signed-off-by: Sage Weil

}
...

Cheers, Dan

On Tue, Apr 25, 2017 at 11:15 AM, Dan van der Ster wrote:
> Hi,
>
> The mons on my test luminous cluster do not start after upgrading
> from 12.0.1 to 12.0.2. Here is the backtrace:
>
> 0> 2017-04-25 11:06:02.897941 7f467ddd7880 -1 *** Caught signal (Aborted) **
> in thread 7f467ddd7880 thread_name:ceph-mon
>
> ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
> 1: (()+0x797e7f) [0x7f467e58ce7f]
> 2: (()+0xf370) [0x7f467d18d370]
> 3: (gsignal()+0x37) [0x7f467a44f1d7]
> 4: (abort()+0x148) [0x7f467a4508c8]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f467ad539d5]
> 6: (()+0x5e946) [0x7f467ad51946]
> 7: (()+0x5e973) [0x7f467ad51973]
> 8: (()+0x5eb93) [0x7f467ad51b93]
> 9: (ceph::buffer::list::iterator_impl::copy(unsigned int, char*)+0xa5) [0x7f467e2fc715]
> 10: (creating_pgs_t::decode(ceph::buffer::list::iterator&)+0x3c) [0x7f467e211e8c]
> 11: (OSDMonitor::update_from_paxos(bool*)+0x225a) [0x7f467e1cd16a]
> 12: (PaxosService::refresh(bool*)+0x1a5) [0x7f467e196335]
> 13: (Monitor::refresh_from_paxos(bool*)+0x19b) [0x7f467e12953b]
> 14: (Monitor::init_paxos()+0x115) [0x7f467e129975]
> 15: (Monitor::preinit()+0x93d) [0x7f467e13b07d]
> 16: (main()+0x2518) [0x7f467e07f848]
> 17: (__libc_start_main()+0xf5) [0x7f467a43bb35]
> 18: (()+0x32671e) [0x7f467e11b71e]
> NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this.
>
> Cheers, Dan
>
>
> On Mon, Apr 24, 2017 at 5:49 PM, Abhishek Lekshmanan wrote:
>> This is the third development checkpoint release of Luminous, the next
>> long term stable release.
>>
>> Major changes from v12.0.1
>> --------------------------
>> * The original librados rados_objects_list_open (C) and objects_begin
>>   (C++) object listing API, deprecated in Hammer, has finally been
>>   removed. Users of this interface must update their software to use
>>   either the rados_nobjects_list_open (C) and nobjects_begin (C++) API or
>>   the new rados_object_list_begin (C) and object_list_begin (C++) API
>>   before updating the client-side librados library to Luminous.
>>
>>   Object enumeration (via any API) with the latest librados version
>>   and pre-Hammer OSDs is no longer supported. Note that no in-tree
>>   Ceph services rely on object enumeration via the deprecated APIs, so
>>   only external librados users might be affected.
>>
>>   The newest (and recommended) rados_object_list_begin (C) and
>>   object_list_begin (C++) API is only usable on clusters with the
>>   SORTBITWISE flag enabled (Jewel and later). (Note that this flag is
>>   required to be set before upgrading beyond Jewel.)
>>
>> * CephFS clients without the 'p' flag in their authentication capability
>>   string will no longer be able to set quotas or any layout fields. This
>>   flag previously only restricted modification of the pool and namespace
>>   fields in layouts.
>>
>> * CephFS directory fragmentation (large directory support) is enabled
>>   by default on new filesystems. To enable it on existing filesystems
>>   use "ceph fs set allow_dirfrags".
>>
>> * CephFS will generate a health warning if you have fewer standby daemons
>>   than it thinks you wanted. By default this will be 1 if you ever had
>>   a standby, and 0 if you did not. You can customize this using
>>   ``ceph fs set standby_count_wanted ``. Setting it
>>   to zero will effectively disable the health check.
>>
>> * The "ceph mds tell ..." command has been removed. It is superseded
>>   by "ceph tell mds. ..."
>>
>> * RGW introduces server side encryption of uploaded objects with 3
>>   options for the management of encryption keys: automatic encryption
>>   (only recommended for test setups), customer provided keys similar to
>>   Amazon SSE KMS specification, and using a key management service
>>   (OpenStack Barbican).
>>
>> For a more detailed changelog, refer to
>> http://ceph.com/releases/ceph-v12-0-2-luminous-dev-released/
>>
>> Getting Ceph
>> ------------
>>
>> * Git at git://github.com/ceph/ceph.git
>> * Tarball at http://download.ceph.com/tarballs/ceph-12.0.2.tar.gz
>> * For packages, see http://docs.ceph.com/docs/master/install/get-packages/
>> * For ceph-deploy, see
>>   http://docs.ceph.com/docs/master/install/install-ceph-deploy
>> * Release sha1: 5a1b6b3269da99a18984c138c23935e5eb96f73e
>>
>> --
>> Abhishek Lekshmanan
>> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
>> HRB 21284 (AG Nürnberg)
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
---
diff --git a/src/mon/OSDMonitor.cc b/src/mon/OSDMonitor.cc
index 543338bdf3..6fa5e8de4b 100644
--- a/src/mon/OSDMonitor.cc
+++ b/src/mon/OSDMonitor.cc
@@ -245,7 +245,7 @@ void OSDMonitor::update_from_paxos(bool *need_bootstrap)
       bufferlist bl;
       mon->store->get(OSD_PG_CREATING_PREFIX, "creating", bl);
       auto p = bl.begin();
-      std::lock_guard l(creating_pgs_lock);
+      std::lock_guard l(creating_pgs_lock);
       creating_pgs.decode(p);
       dout(7) << __func__ << " loading creating_pgs e" << creating_pgs.last_scan_epoch << dendl;
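
For readers following the backtrace: frames 4 to 10 look consistent with an exception thrown from copy() inside creating_pgs_t::decode() that nothing catches, so the runtime ends in the terminate handler and abort(). The sketch below only illustrates that pattern, together with a std::lock_guard over a std::mutex around the decode step as named in the commit title. It is not Ceph code. ByteReader, CreatingPgs, MonitorState, encode_epoch and demo are hypothetical names invented for this example, and the single 4-byte epoch field is an assumed stand-in for the real on-disk format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a buffer iterator: copy() throws when asked for
// more bytes than remain, instead of reading past the end.
class ByteReader {
public:
  explicit ByteReader(const std::vector<std::uint8_t>& buf) : buf_(buf) {}

  void copy(std::size_t len, void* dest) {
    if (len > buf_.size() - pos_)
      throw std::out_of_range("read past end of buffer");
    std::memcpy(dest, buf_.data() + pos_, len);
    pos_ += len;
  }

private:
  const std::vector<std::uint8_t>& buf_;
  std::size_t pos_ = 0;
};

// Hypothetical stand-in for the decoded structure; only an epoch is modelled.
struct CreatingPgs {
  std::uint32_t last_scan_epoch = 0;

  void decode(ByteReader& r) {
    std::uint32_t epoch = 0;
    r.copy(sizeof(epoch), &epoch);  // throws if the blob is too short
    last_scan_epoch = epoch;
  }
};

// Helper for the example: serialise an epoch in host byte order.
std::vector<std::uint8_t> encode_epoch(std::uint32_t epoch) {
  std::vector<std::uint8_t> out(sizeof(epoch));
  std::memcpy(out.data(), &epoch, sizeof(epoch));
  return out;
}

struct MonitorState {
  std::mutex creating_pgs_lock;
  CreatingPgs creating_pgs;

  // Decode under the mutex. A decode failure is caught and reported to the
  // caller, so it never reaches std::terminate. The lock_guard releases the
  // mutex on both the normal and the exceptional path.
  bool load(const std::vector<std::uint8_t>& blob) {
    ByteReader r(blob);
    std::lock_guard<std::mutex> l(creating_pgs_lock);
    try {
      CreatingPgs decoded;
      decoded.decode(r);
      creating_pgs = decoded;  // commit only after a full, successful decode
      return true;
    } catch (const std::out_of_range&) {
      return false;  // previous state is left untouched
    }
  }
};

// Usage: a well-formed blob loads, a truncated one is rejected without abort.
void demo() {
  MonitorState s;
  assert(s.load(encode_epoch(7)));
  assert(!s.load(std::vector<std::uint8_t>{0x01, 0x02}));
  assert(s.creating_pgs.last_scan_epoch == 7);
}
```

Calling demo() from main() exercises both paths. Whether the monitor should tolerate an undecodable blob at this point, or fail loudly as it does in the backtrace, is a separate design question that this sketch does not try to answer.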