From patchwork Mon May 13 17:58:44 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Elder X-Patchwork-Id: 2559721 Return-Path: X-Original-To: patchwork-ceph-devel@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id 40DF9DF2E5 for ; Mon, 13 May 2013 17:58:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753451Ab3EMR6s (ORCPT ); Mon, 13 May 2013 13:58:48 -0400 Received: from mail-oa0-f51.google.com ([209.85.219.51]:55421 "EHLO mail-oa0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752121Ab3EMR6s (ORCPT ); Mon, 13 May 2013 13:58:48 -0400 Received: by mail-oa0-f51.google.com with SMTP id f4so7945190oah.10 for ; Mon, 13 May 2013 10:58:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type :content-transfer-encoding:x-gm-message-state; bh=ZvrANGhDslgMERza0XLMj0xTDnP6ZbomUt8Sd0YJ0Z4=; b=hwT/39QJwKulHBBq1J/prT/qGS+CznRrv3Vdt0fpzqsePmM6b3Yv/3tVtDT9AjXidm 1DGr8BnEt2QF8ssyaVXUL9ws3SRJPdcVfauxCGCnH15qHmfQW/EtCOvTyRo619DAmc3u 1/k7nKYHIRc0WLqopsmMCXMyfA14kPoOVeENXJ2xI6GGOp2UKeU2Ja3foon+c3in3ZLL 5A/7ec5wjaQdXRigLV14cqlRlcC7zLDQThdfGujIWXfbiz3yWFyHtEfycL5w6Csrsjid u5rSSl4YdDBQWreIFFTQPAN0kLLdqZqu8lYHEvDDpfjq2zqr2TloND0VWLAG3CvpanAT g7nQ== X-Received: by 10.60.150.243 with SMTP id ul19mr13930653oeb.8.1368467927330; Mon, 13 May 2013 10:58:47 -0700 (PDT) Received: from [172.22.22.4] (c-71-195-31-37.hsd1.mn.comcast.net. [71.195.31.37]) by mx.google.com with ESMTPSA id c20sm17686245oez.4.2013.05.13.10.58.45 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Mon, 13 May 2013 10:58:45 -0700 (PDT) Message-ID: <519129D4.30109@inktank.com> Date: Mon, 13 May 2013 12:58:44 -0500 From: Alex Elder User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5 MIME-Version: 1.0 To: Josh Durgin CC: ceph-devel@vger.kernel.org Subject: [PATCH 1/5, v2] rbd: get parent info on refresh References: <518E831E.6000206@inktank.com> <518E8357.7010506@inktank.com> <518EB11D.9050600@inktank.com> In-Reply-To: <518EB11D.9050600@inktank.com> X-Gm-Message-State: ALoCoQkGmhHTwBWgOc/xifd0kUHiRVJtT4xiV/SJTeJlloKe7d/v3YI4geiq1usv9Srl+Fo0V0MT Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Get parent info for format 2 images on every refresh (rather than just during the initial probe). This will be needed to detect the disappearance of the parent image in the event a mapped image becomes unlayered (i.e., flattened). Avoid leaking the previous parent spec on the second and subsequent times this information is requested by dropping the previous one (if any) before updating it. (Also, extract the pool id into a local variable before assigning it into the parent spec.) Switch to using a non-zero parent overlap value rather than the existence of a parent (a non-null parent_spec pointer) to determine whether to mark a request layered. It will soon be possible for a layered image to become unlayered while a request is in flight. This means that the layered flag for an image request indicates that there was a non-zero parent overlap at the time the image request was created. The parent overlap can change thereafter, which may lead to special handling at request submission or completion time. This and the next several patches are related to: http://tracker.ceph.com/issues/3763 NOTE: If an error occurs while refreshing the parent info (i.e., requesting it after initial probe), the old parent info will persist. This is not really correct, and is a scenario that needs to be addressed. For now we'll assert that the failure mode is unlikely, but the issue has been documented in tracker issue 5040. Signed-off-by: Alex Elder Reviewed-by: Josh Durgin --- v2: no longer clean up rbd_dev and header on error in refresh drivers/block/rbd.c | 67 ++++++++++++++++++++++++++++----------------------- 1 file changed, 37 insertions(+), 30 deletions(-) int ret; @@ -3643,18 +3644,19 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) p = reply_buf; end = reply_buf + ret; ret = -ERANGE; - ceph_decode_64_safe(&p, end, parent_spec->pool_id, out_err); - if (parent_spec->pool_id == CEPH_NOPOOL) + ceph_decode_64_safe(&p, end, pool_id, out_err); + if (pool_id == CEPH_NOPOOL) goto out; /* No parent? No problem. */ /* The ceph file layout needs to fit pool id in 32 bits */ ret = -EIO; - if (parent_spec->pool_id > (u64)U32_MAX) { + if (pool_id > (u64)U32_MAX) { rbd_warn(NULL, "parent pool id too large (%llu > %u)\n", - (unsigned long long)parent_spec->pool_id, U32_MAX); + (unsigned long long)pool_id, U32_MAX); goto out_err; } + parent_spec->pool_id = pool_id; image_id = ceph_extract_encoded_string(&p, end, NULL, GFP_KERNEL); if (IS_ERR(image_id)) { @@ -3666,6 +3668,7 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) ceph_decode_64_safe(&p, end, overlap, out_err); if (overlap) { + rbd_spec_put(rbd_dev->parent_spec); rbd_dev->parent_spec = parent_spec; parent_spec = NULL; /* rbd_dev now owns this */ rbd_dev->parent_overlap = overlap; @@ -4034,17 +4037,43 @@ static int rbd_dev_v2_header_info(struct rbd_device *rbd_dev) goto out; } + /* + * If the image supports layering, get the parent info. We + * need to probe the first time regardless. Thereafter we + * only need to if there's a parent, to see if it has + * disappeared due to the mapped image getting flattened. + */ + if (rbd_dev->header.features & RBD_FEATURE_LAYERING && + (first_time || rbd_dev->parent_spec)) { + bool warn; + + ret = rbd_dev_v2_parent_info(rbd_dev); + if (ret) + goto out; + + /* + * Print a warning if this is the initial probe and + * the image has a parent. Don't print it if the + * image now being probed is itself a parent. We + * can tell at this point because we won't know its + * pool name yet (just its pool id). + */ + warn = rbd_dev->parent_spec && rbd_dev->spec->pool_name; + if (first_time && warn) + rbd_warn(rbd_dev, "WARNING: kernel layering " + "is EXPERIMENTAL!"); + } + ret = rbd_dev_v2_image_size(rbd_dev); if (ret) goto out; + if (rbd_dev->spec->snap_id == CEPH_NOSNAP) if (rbd_dev->mapping.size != rbd_dev->header.image_size) rbd_dev->mapping.size = rbd_dev->header.image_size; ret = rbd_dev_v2_snap_context(rbd_dev); dout("rbd_dev_v2_snap_context returned %d\n", ret); - if (ret) - goto out; out: up_write(&rbd_dev->header_rwsem); @@ -4498,24 +4527,6 @@ static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev) if (ret) goto out_err; - /* If the image supports layering, get the parent info */ - - if (rbd_dev->header.features & RBD_FEATURE_LAYERING) { - ret = rbd_dev_v2_parent_info(rbd_dev); - if (ret) - goto out_err; - /* - * Print a warning if this image has a parent. - * Don't print it if the image now being probed - * is itself a parent. We can tell at this point - * because we won't know its pool name yet (just its - * pool id). - */ - if (rbd_dev->parent_spec && rbd_dev->spec->pool_name) - rbd_warn(rbd_dev, "WARNING: kernel layering " - "is EXPERIMENTAL!"); - } - /* If the image supports fancy striping, get its parameters */ if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) { @@ -4527,11 +4538,7 @@ static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev) return 0; out_err: - rbd_dev->parent_overlap = 0; - rbd_spec_put(rbd_dev->parent_spec); - rbd_dev->parent_spec = NULL; - kfree(rbd_dev->header_name); - rbd_dev->header_name = NULL; + rbd_dev->header.features = 0; kfree(rbd_dev->header.object_prefix); rbd_dev->header.object_prefix = NULL; diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index b67ecda..fcef63c 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -1873,7 +1873,7 @@ static struct rbd_img_request *rbd_img_request_create( } if (child_request) img_request_child_set(img_request); - if (rbd_dev->parent_spec) + if (rbd_dev->parent_overlap) img_request_layered_set(img_request); spin_lock_init(&img_request->completion_lock); img_request->next_completion = 0; @@ -3613,6 +3613,7 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) __le64 snapid; void *p; void *end; + u64 pool_id; char *image_id; u64 overlap;