From patchwork Wed Mar 26 03:54:33 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Elder
X-Patchwork-Id: 3891881
Message-ID: <53324F79.1080108@ieee.org>
Date: Tue, 25 Mar 2014 22:54:33 -0500
From: Alex Elder
To: Olivier Bonvalet
CC: Ilya Dryomov, Ceph Development
Subject: Re: Issue #5876 : assertion failure in rbd_img_obj_callback()
References: <1395736765.2823.29.camel@localhost>
 <1395773582.2076.10.camel@localhost> <5331D2E8.6060002@ieee.org>
 <1395778894.2076.12.camel@localhost> <1395780835.2076.15.camel@localhost>
 <1395781847.2076.21.camel@localhost> <1395782577.2076.23.camel@localhost>
 <1395783675.2076.26.camel@localhost> <1395784476.2076.28.camel@localhost>
 <1395785839.2076.30.camel@localhost> <5332075F.8080105@ieee.org>
 <1395788695.2076.35.camel@localhost> <53321896.1080606@ieee.org>
 <1395797596.2076.43.camel@localhost> <1395798658.2076.45.camel@localhost>
 <5332339A.8030000@ieee.org> <1395801625.2076.52.camel@localhost>
 <53323EA5.6010506@ieee.org> <1395801940.2076.54.camel@localhost>
In-Reply-To: <1395801940.2076.54.camel@localhost>
X-Mailing-List: ceph-devel@vger.kernel.org

On 03/25/2014 09:45 PM, Olivier Bonvalet wrote:
> On Tuesday 25 March 2014 at 21:42 -0500, Alex Elder wrote:
>> PS  I thought you said you were
>> going to stop for the night!
>
> Yes, I would love to! But my phone won't stop ringing about ceph crashes :D
>

OK, one more thing to try and I'm going to bed.

I'm hoping that an image request spanning multiple objects is
unusual enough that the following won't overwhelm you with output.
If possible, I'd avoid putting it on a production system (that goes
for all of this testing, really).

Basically I'm trying to catch an image object request being
either submitted more than once, or completing more than once.
So if an image request has more than one object request, I
produce some informative output.

This patch fixes two warnings in the previous debug patch, and
adds to it (so use it instead of the last one).

If you get a chance to try this, I'll want to see the output.
But first, I shall sleep.

Thank you.

					-Alex

---

Index: b/drivers/block/rbd.c
===================================================================
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -1484,6 +1484,10 @@ static void rbd_img_request_complete(str
 			xferred += obj_request->xferred;
 		img_request->xferred = xferred;
 	}
+	if (img_request->obj_request_count > 1)
+		printk("%s: img_request %p count %u result %d xferred %llu\n",
+			__func__, img_request, img_request->obj_request_count,
+			img_request->result, img_request->xferred);
 
 	if (img_request->callback)
 		img_request->callback(img_request);
@@ -2129,9 +2133,43 @@ static void rbd_img_obj_callback(struct
 	rbd_assert(which != BAD_WHICH);
 	rbd_assert(which < img_request->obj_request_count);
 
+	if (img_request->obj_request_count > 1)
+		printk("%s: obj_request %p (%llu/%llu)\n", __func__,
+			obj_request, obj_request->offset, obj_request->length);
 	spin_lock_irq(&img_request->completion_lock);
 	if (which > img_request->next_completion)
 		goto out;
+	if (which != img_request->next_completion) {
+		printk("%s: bad image object request information:\n", __func__);
+		printk("obj_request %p\n", obj_request);
+		printk("    ->object_name <%s>\n", obj_request->object_name);
+		printk("    ->offset %llu\n", obj_request->offset);
+		printk("    ->length %llu\n", obj_request->length);
+		printk("    ->type 0x%x\n", (u32)obj_request->type);
+		printk("    ->flags 0x%lx\n", obj_request->flags);
+		printk("    ->img_request %p\n", obj_request->img_request);
+		printk("    ->which %u\n", obj_request->which);
+		printk("    ->xferred %llu\n", obj_request->xferred);
+		printk("    ->result %d\n", obj_request->result);
+		printk("    ->kref %d\n",
+			atomic_read(&obj_request->kref.refcount));
+
+		printk("img_request %p\n", img_request);
+		printk("    ->snap 0x%016llx\n", img_request->snap_id);
+		printk("    ->offset %llu\n", img_request->offset);
+		printk("    ->length %llu\n", img_request->length);
+		printk("    ->flags 0x%lx\n", img_request->flags);
+		printk("    ->obj_request_count %u\n",
+			img_request->obj_request_count);
+		printk("    ->next_completion %u\n",
+			img_request->next_completion);
+		printk("    ->xferred %llu\n", img_request->xferred);
+		printk("    ->result %d\n", img_request->result);
+		printk("    ->obj_requests head %p\n",
+			img_request->obj_requests.next);
+		printk("    ->kref %d\n",
+			atomic_read(&img_request->kref.refcount));
+	}
 	rbd_assert(which == img_request->next_completion);
 
 	for_each_obj_request_from(img_request, obj_request) {
@@ -2697,11 +2735,22 @@ static int rbd_img_request_submit(struct
 {
 	struct rbd_obj_request *obj_request;
 	struct rbd_obj_request *next_obj_request;
+	bool verbose = false;
 
 	dout("%s: img %p\n", __func__, img_request);
+	if (img_request->obj_request_count > 1) {
+		printk("%s: img_request %p count %u (%llu/%llu)\n", __func__,
+			img_request, img_request->obj_request_count,
+			img_request->offset, img_request->length);
+		verbose = true;
+	}
 	for_each_obj_request_safe(img_request, obj_request, next_obj_request) {
 		int ret;
 
+		if (verbose)
+			printk("%s: obj_request %p (%llu/%llu)\n", __func__,
+				obj_request, obj_request->offset,
+				obj_request->length);
 		ret = rbd_img_obj_request_submit(obj_request);
 		if (ret)
 			return ret;
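
For anyone reading along, the ordering rule that the new output in
rbd_img_obj_callback() instruments can be modeled in user space. This
is only a rough sketch under stated assumptions, not the driver code:
struct img_model, img_model_init(), obj_callback() and the CB_* results
are hypothetical names invented for illustration, and locking and
reference counting are left out. It assumes, as in the patched code,
that a callback with an index above next_completion is deferred, one
equal to it advances past every finished object, and one below it is
the "completed more than once" case the debug dump reports.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_OBJ 8	/* arbitrary bound for this model only */

/* Hypothetical stand-in for an image request; not the kernel struct. */
struct img_model {
	unsigned int obj_count;		/* object requests in the image request */
	unsigned int next_completion;	/* lowest index not yet completed in order */
	bool done[MAX_OBJ];		/* object request has finished */
};

enum cb_result { CB_BAD_INDEX, CB_DUPLICATE, CB_DEFERRED, CB_ADVANCED };

static void img_model_init(struct img_model *img, unsigned int count)
{
	unsigned int i;

	assert(count <= MAX_OBJ);
	img->obj_count = count;
	img->next_completion = 0;
	for (i = 0; i < MAX_OBJ; i++)
		img->done[i] = false;
}

/* Model of one object-request callback for index "which". */
static enum cb_result obj_callback(struct img_model *img, unsigned int which)
{
	if (which >= img->obj_count)
		return CB_BAD_INDEX;

	/* Already accounted for: the case the assertion trips on. */
	if (which < img->next_completion)
		return CB_DUPLICATE;

	img->done[which] = true;

	/* An earlier object is still outstanding; it will sweep this one up. */
	if (which > img->next_completion)
		return CB_DEFERRED;

	/* In order: advance past every object that has already finished. */
	while (img->next_completion < img->obj_count &&
	       img->done[img->next_completion])
		img->next_completion++;

	return CB_ADVANCED;
}
```

With three object requests, completing index 1 first is deferred,
completing index 0 then advances next_completion to 2, and a second
callback for index 1 is reported as a duplicate, which is the pattern
the printk output is meant to expose.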