From patchwork Mon Sep 25 19:40:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 13398326 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3EF29CD54BF for ; Mon, 25 Sep 2023 19:41:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232340AbjIYTlF (ORCPT ); Mon, 25 Sep 2023 15:41:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39406 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231450AbjIYTlE (ORCPT ); Mon, 25 Sep 2023 15:41:04 -0400 Received: from mail-lf1-x12e.google.com (mail-lf1-x12e.google.com [IPv6:2a00:1450:4864:20::12e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 94E08101 for ; Mon, 25 Sep 2023 12:40:57 -0700 (PDT) Received: by mail-lf1-x12e.google.com with SMTP id 2adb3069b0e04-50325ce89e9so12036230e87.0 for ; Mon, 25 Sep 2023 12:40:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695670856; x=1696275656; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uluF7rC4DV182er+Ikvn30IR/aP2+rtAmb+5lqBFiGI=; b=PxAHOJ1E6ONLFDOqbYg+KzYYfk1WPgob6ZHnJ7MG/t/grXMLiqnlpPiT2gEvu4srTE S76WbQqFkpOqh0jFUtm47okLEmAy2x3DxtRETkvz/+2I/CQoZX/vgTL8bLBlij/fTJlO WJB3GN4/WPIcFHYFDhanarCkFO5z918eZk4+LpfpGkkURMH9/sZilBm6+tmINEy6D3Uu bRcVbON14jxMxd9T/qR1vo+6w+ioTGkjw46oXn90nDOQnEbN2H6+Xj/NXnoDRb65iSi1 jwluhJu+9I5bgqrnMRx2mTZwQ+prCQ00thsyEHwDpUgiECn1nGZNTb3Wsw7SHmSLoZQb FPSQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695670856; x=1696275656; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uluF7rC4DV182er+Ikvn30IR/aP2+rtAmb+5lqBFiGI=; b=LQCtXX5FUmR/DL/KiqllmqEm7A2o0hsOSCtI6XpjHXpIKmyLcqiwmFk9j6rjhEFi9k 76R2jVisqwIaAYVPiROU5yZt6DnXGg1FFM5MqESyEXGd0kP5c0/ZV/d2EzuOyRl8MmAJ plokm9Pcji979CeCmGJhB1PZFVhJ1W0G30AQhqcwSdIORf90PhccXx8eI7AnjPHAQ/vA 1dRmuLoezmZHUtKaa0qgG2h6aq5+DRuBrc2ONn8PJubLt9/1qIejp3NbSQuRQNncQuK/ l+N5tTfsd25p3+DmHBIUA70OWo7AOtQx9KTXNbgXUf9fE0wDkeNg3wdg9meYvJYPtYN/ 724A== X-Gm-Message-State: AOJu0YxBkIQhJQ+QQ94PcmWwQC/rhHq2UT624hnV3Tcw1gqx0K4b2QD/ RcUkobfmMJtE/8SK+Nz/SrPFPYLcdNQ= X-Google-Smtp-Source: AGHT+IF7Wj75LpIGuB8uW0xf9PpjdefRiNEGdCcMgMWjn1/NQ8F4IPAgRWr7PyEwlnQiibk1CvWlAw== X-Received: by 2002:a19:6710:0:b0:500:9a15:9054 with SMTP id b16-20020a196710000000b005009a159054mr5809904lfc.20.1695670855556; Mon, 25 Sep 2023 12:40:55 -0700 (PDT) Received: from localhost.localdomain (ip-94-112-167-15.bb.vodafone.cz. [94.112.167.15]) by smtp.gmail.com with ESMTPSA id en13-20020a056402528d00b005340d9d042bsm1762365edb.40.2023.09.25.12.40.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 12:40:54 -0700 (PDT) From: Ilya Dryomov To: ceph-devel@vger.kernel.org Cc: Dongsheng Yang Subject: [PATCH 1/4] rbd: move rbd_dev_refresh() definition Date: Mon, 25 Sep 2023 21:40:31 +0200 Message-ID: <20230925194036.197899-2-idryomov@gmail.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230925194036.197899-1-idryomov@gmail.com> References: <20230925194036.197899-1-idryomov@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Move rbd_dev_refresh() definition further down to avoid having to move struct parent_image_info definition in the next commit. This spares some forward declarations too. Signed-off-by: Ilya Dryomov --- drivers/block/rbd.c | 68 ++++++++++++++++++++++----------------------- 1 file changed, 33 insertions(+), 35 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 3de11f077144..5da001f1fe94 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -633,8 +633,6 @@ static void rbd_dev_remove_parent(struct rbd_device *rbd_dev); static int rbd_dev_refresh(struct rbd_device *rbd_dev); static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev); -static int rbd_dev_header_info(struct rbd_device *rbd_dev); -static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev); static const char *rbd_dev_v2_snap_name(struct rbd_device *rbd_dev, u64 snap_id); static int _rbd_dev_v2_snap_size(struct rbd_device *rbd_dev, u64 snap_id, @@ -4931,39 +4929,6 @@ static void rbd_dev_update_size(struct rbd_device *rbd_dev) } } -static int rbd_dev_refresh(struct rbd_device *rbd_dev) -{ - u64 mapping_size; - int ret; - - down_write(&rbd_dev->header_rwsem); - mapping_size = rbd_dev->mapping.size; - - ret = rbd_dev_header_info(rbd_dev); - if (ret) - goto out; - - /* - * If there is a parent, see if it has disappeared due to the - * mapped image getting flattened. - */ - if (rbd_dev->parent) { - ret = rbd_dev_v2_parent_info(rbd_dev); - if (ret) - goto out; - } - - rbd_assert(!rbd_is_snap(rbd_dev)); - rbd_dev->mapping.size = rbd_dev->header.image_size; - -out: - up_write(&rbd_dev->header_rwsem); - if (!ret && mapping_size != rbd_dev->mapping.size) - rbd_dev_update_size(rbd_dev); - - return ret; -} - static const struct blk_mq_ops rbd_mq_ops = { .queue_rq = rbd_queue_rq, }; @@ -7043,6 +7008,39 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) return ret; } +static int rbd_dev_refresh(struct rbd_device *rbd_dev) +{ + u64 mapping_size; + int ret; + + down_write(&rbd_dev->header_rwsem); + mapping_size = rbd_dev->mapping.size; + + ret = rbd_dev_header_info(rbd_dev); + if (ret) + goto out; + + /* + * If there is a parent, see if it has disappeared due to the + * mapped image getting flattened. + */ + if (rbd_dev->parent) { + ret = rbd_dev_v2_parent_info(rbd_dev); + if (ret) + goto out; + } + + rbd_assert(!rbd_is_snap(rbd_dev)); + rbd_dev->mapping.size = rbd_dev->header.image_size; + +out: + up_write(&rbd_dev->header_rwsem); + if (!ret && mapping_size != rbd_dev->mapping.size) + rbd_dev_update_size(rbd_dev); + + return ret; +} + static ssize_t do_rbd_add(const char *buf, size_t count) { struct rbd_device *rbd_dev = NULL; From patchwork Mon Sep 25 19:40:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 13398327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD6AFCE79A2 for ; Mon, 25 Sep 2023 19:41:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233234AbjIYTlH (ORCPT ); Mon, 25 Sep 2023 15:41:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233109AbjIYTlG (ORCPT ); Mon, 25 Sep 2023 15:41:06 -0400 Received: from mail-ed1-x52c.google.com (mail-ed1-x52c.google.com [IPv6:2a00:1450:4864:20::52c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 94353109 for ; Mon, 25 Sep 2023 12:40:58 -0700 (PDT) Received: by mail-ed1-x52c.google.com with SMTP id 4fb4d7f45d1cf-530ce262ab2so7987280a12.1 for ; Mon, 25 Sep 2023 12:40:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695670857; x=1696275657; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zlZSp3HAJV0V8N/+T/PE7QSZP4u9kcsw4nRTJbhUX0U=; b=GXqB/QxbE1s6egKfXxDPPNSZgocIicgLIB/bUfGGVEArPzPnK/Vvz6a9YiiHKguW4n u006KrawE40hwjxzWcSahp9yLQzoYWTyXtEJRR5QuXEYeikLB941Jjr9bB9RppN7Ubys 2qNPcQawZdY4KzdiCqjghCLfesL5lewBVati1+Mfya18/qurCxXq/KW0h0z5noZPYT1x sfrz+Mmt5hgnEaQ892BO9OGLmNlEiWqmFZ7/xifsEUEUm+6/3yFZ+u79ifpkxaApXRWw bukz0NxaFfxH6lLhfT1wTgTgen7tfUO0AluNmqSCQ5ojjj1YdIG6hNwIxOfwErLYeiD8 dSNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695670857; x=1696275657; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zlZSp3HAJV0V8N/+T/PE7QSZP4u9kcsw4nRTJbhUX0U=; b=i5j4rfBbg1GvFj+pxXr7DU6EUSz6rDvGV47wqoQ2pP1ausodzbLMhYgA1oAYnJUCKr G3e902K3su9HBGPbLmI0NFWLCVzl+23HVovrW3Kq3EKpkluc8swdlYZ7v50vTu/V5fWC fHKyK5hb7dxhgAAcQrLz2gZJgPltseAaK0oAWLEEDgbniftwGlOnb7pBdO/vY5kjB038 fFqYb9xoM+VHw+I4S8lgZoIx6mXn1hsARY4CAOQa/vp+1EBLNAyrhTxrmi9ak4GEcACf R0MS8RTQjmRat9FZXXjLdyUZjEjCYnU2XAkF7RPRrw4knAAlA2URCdf2CPRgnyUMd2As g2LQ== X-Gm-Message-State: AOJu0YzD9ZiesD8UfieZZFfTPd80aiVCQOkCul12Ywkkbm9g6+GR1zK5 uSVxvjUtqEMQ7AZRY00xaePw3/kWu7Y= X-Google-Smtp-Source: AGHT+IHp1Wz5ECp/9DYPPHvMSMLK5STRGTfNG0x8J7ODt9B2ZPhC64OrrEd3U5G7YY78GEPTxlsKgg== X-Received: by 2002:aa7:d948:0:b0:530:dd97:fe78 with SMTP id l8-20020aa7d948000000b00530dd97fe78mr7254699eds.7.1695670856803; Mon, 25 Sep 2023 12:40:56 -0700 (PDT) Received: from localhost.localdomain (ip-94-112-167-15.bb.vodafone.cz. [94.112.167.15]) by smtp.gmail.com with ESMTPSA id en13-20020a056402528d00b005340d9d042bsm1762365edb.40.2023.09.25.12.40.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 12:40:56 -0700 (PDT) From: Ilya Dryomov To: ceph-devel@vger.kernel.org Cc: Dongsheng Yang Subject: [PATCH 2/4] rbd: decouple header read-in from updating rbd_dev->header Date: Mon, 25 Sep 2023 21:40:32 +0200 Message-ID: <20230925194036.197899-3-idryomov@gmail.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230925194036.197899-1-idryomov@gmail.com> References: <20230925194036.197899-1-idryomov@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Make rbd_dev_header_info() populate a passed struct rbd_image_header instead of rbd_dev->header and introduce rbd_dev_update_header() for updating mutable fields in rbd_dev->header upon refresh. The initial read-in of both mutable and immutable fields in rbd_dev_image_probe() passes in rbd_dev->header so no update step is required there. rbd_init_layout() is now called directly from rbd_dev_image_probe() instead of individually in format 1 and format 2 implementations. Signed-off-by: Ilya Dryomov --- drivers/block/rbd.c | 206 ++++++++++++++++++++++++-------------------- 1 file changed, 114 insertions(+), 92 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 5da001f1fe94..6ed5520ef303 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -632,7 +632,8 @@ void rbd_warn(struct rbd_device *rbd_dev, const char *fmt, ...) static void rbd_dev_remove_parent(struct rbd_device *rbd_dev); static int rbd_dev_refresh(struct rbd_device *rbd_dev); -static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev); +static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev, + struct rbd_image_header *header); static const char *rbd_dev_v2_snap_name(struct rbd_device *rbd_dev, u64 snap_id); static int _rbd_dev_v2_snap_size(struct rbd_device *rbd_dev, u64 snap_id, @@ -993,15 +994,24 @@ static void rbd_init_layout(struct rbd_device *rbd_dev) RCU_INIT_POINTER(rbd_dev->layout.pool_ns, NULL); } +static void rbd_image_header_cleanup(struct rbd_image_header *header) +{ + kfree(header->object_prefix); + ceph_put_snap_context(header->snapc); + kfree(header->snap_sizes); + kfree(header->snap_names); + + memset(header, 0, sizeof(*header)); +} + /* * Fill an rbd image header with information from the given format 1 * on-disk header. */ -static int rbd_header_from_disk(struct rbd_device *rbd_dev, - struct rbd_image_header_ondisk *ondisk) +static int rbd_header_from_disk(struct rbd_image_header *header, + struct rbd_image_header_ondisk *ondisk, + bool first_time) { - struct rbd_image_header *header = &rbd_dev->header; - bool first_time = header->object_prefix == NULL; struct ceph_snap_context *snapc; char *object_prefix = NULL; char *snap_names = NULL; @@ -1068,11 +1078,6 @@ static int rbd_header_from_disk(struct rbd_device *rbd_dev, if (first_time) { header->object_prefix = object_prefix; header->obj_order = ondisk->options.order; - rbd_init_layout(rbd_dev); - } else { - ceph_put_snap_context(header->snapc); - kfree(header->snap_names); - kfree(header->snap_sizes); } /* The remaining fields always get updated (when we refresh) */ @@ -4857,7 +4862,9 @@ static int rbd_obj_read_sync(struct rbd_device *rbd_dev, * return, the rbd_dev->header field will contain up-to-date * information about the image. */ -static int rbd_dev_v1_header_info(struct rbd_device *rbd_dev) +static int rbd_dev_v1_header_info(struct rbd_device *rbd_dev, + struct rbd_image_header *header, + bool first_time) { struct rbd_image_header_ondisk *ondisk = NULL; u32 snap_count = 0; @@ -4905,7 +4912,7 @@ static int rbd_dev_v1_header_info(struct rbd_device *rbd_dev) snap_count = le32_to_cpu(ondisk->snap_count); } while (snap_count != want_count); - ret = rbd_header_from_disk(rbd_dev, ondisk); + ret = rbd_header_from_disk(header, ondisk, first_time); out: kfree(ondisk); @@ -5468,17 +5475,12 @@ static int _rbd_dev_v2_snap_size(struct rbd_device *rbd_dev, u64 snap_id, return 0; } -static int rbd_dev_v2_image_size(struct rbd_device *rbd_dev) -{ - return _rbd_dev_v2_snap_size(rbd_dev, CEPH_NOSNAP, - &rbd_dev->header.obj_order, - &rbd_dev->header.image_size); -} - -static int rbd_dev_v2_object_prefix(struct rbd_device *rbd_dev) +static int rbd_dev_v2_object_prefix(struct rbd_device *rbd_dev, + char **pobject_prefix) { size_t size; void *reply_buf; + char *object_prefix; int ret; void *p; @@ -5496,16 +5498,16 @@ static int rbd_dev_v2_object_prefix(struct rbd_device *rbd_dev) goto out; p = reply_buf; - rbd_dev->header.object_prefix = ceph_extract_encoded_string(&p, - p + ret, NULL, GFP_NOIO); + object_prefix = ceph_extract_encoded_string(&p, p + ret, NULL, + GFP_NOIO); + if (IS_ERR(object_prefix)) { + ret = PTR_ERR(object_prefix); + goto out; + } ret = 0; - if (IS_ERR(rbd_dev->header.object_prefix)) { - ret = PTR_ERR(rbd_dev->header.object_prefix); - rbd_dev->header.object_prefix = NULL; - } else { - dout(" object_prefix = %s\n", rbd_dev->header.object_prefix); - } + *pobject_prefix = object_prefix; + dout(" object_prefix = %s\n", object_prefix); out: kfree(reply_buf); @@ -5556,13 +5558,6 @@ static int _rbd_dev_v2_snap_features(struct rbd_device *rbd_dev, u64 snap_id, return 0; } -static int rbd_dev_v2_features(struct rbd_device *rbd_dev) -{ - return _rbd_dev_v2_snap_features(rbd_dev, CEPH_NOSNAP, - rbd_is_ro(rbd_dev), - &rbd_dev->header.features); -} - /* * These are generic image flags, but since they are used only for * object map, store them in rbd_dev->object_map_flags. @@ -5837,14 +5832,14 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) return ret; } -static int rbd_dev_v2_striping_info(struct rbd_device *rbd_dev) +static int rbd_dev_v2_striping_info(struct rbd_device *rbd_dev, + u64 *stripe_unit, u64 *stripe_count) { struct { __le64 stripe_unit; __le64 stripe_count; } __attribute__ ((packed)) striping_info_buf = { 0 }; size_t size = sizeof (striping_info_buf); - void *p; int ret; ret = rbd_obj_method_sync(rbd_dev, &rbd_dev->header_oid, @@ -5856,27 +5851,33 @@ static int rbd_dev_v2_striping_info(struct rbd_device *rbd_dev) if (ret < size) return -ERANGE; - p = &striping_info_buf; - rbd_dev->header.stripe_unit = ceph_decode_64(&p); - rbd_dev->header.stripe_count = ceph_decode_64(&p); + *stripe_unit = le64_to_cpu(striping_info_buf.stripe_unit); + *stripe_count = le64_to_cpu(striping_info_buf.stripe_count); + dout(" stripe_unit = %llu stripe_count = %llu\n", *stripe_unit, + *stripe_count); + return 0; } -static int rbd_dev_v2_data_pool(struct rbd_device *rbd_dev) +static int rbd_dev_v2_data_pool(struct rbd_device *rbd_dev, s64 *data_pool_id) { - __le64 data_pool_id; + __le64 data_pool_buf; int ret; ret = rbd_obj_method_sync(rbd_dev, &rbd_dev->header_oid, &rbd_dev->header_oloc, "get_data_pool", - NULL, 0, &data_pool_id, sizeof(data_pool_id)); + NULL, 0, &data_pool_buf, + sizeof(data_pool_buf)); + dout("%s: rbd_obj_method_sync returned %d\n", __func__, ret); if (ret < 0) return ret; - if (ret < sizeof(data_pool_id)) + if (ret < sizeof(data_pool_buf)) return -EBADMSG; - rbd_dev->header.data_pool_id = le64_to_cpu(data_pool_id); - WARN_ON(rbd_dev->header.data_pool_id == CEPH_NOPOOL); + *data_pool_id = le64_to_cpu(data_pool_buf); + dout(" data_pool_id = %lld\n", *data_pool_id); + WARN_ON(*data_pool_id == CEPH_NOPOOL); + return 0; } @@ -6068,7 +6069,8 @@ static int rbd_spec_fill_names(struct rbd_device *rbd_dev) return ret; } -static int rbd_dev_v2_snap_context(struct rbd_device *rbd_dev) +static int rbd_dev_v2_snap_context(struct rbd_device *rbd_dev, + struct ceph_snap_context **psnapc) { size_t size; int ret; @@ -6129,9 +6131,7 @@ static int rbd_dev_v2_snap_context(struct rbd_device *rbd_dev) for (i = 0; i < snap_count; i++) snapc->snaps[i] = ceph_decode_64(&p); - ceph_put_snap_context(rbd_dev->header.snapc); - rbd_dev->header.snapc = snapc; - + *psnapc = snapc; dout(" snap context seq = %llu, snap_count = %u\n", (unsigned long long)seq, (unsigned int)snap_count); out: @@ -6180,38 +6180,42 @@ static const char *rbd_dev_v2_snap_name(struct rbd_device *rbd_dev, return snap_name; } -static int rbd_dev_v2_header_info(struct rbd_device *rbd_dev) +static int rbd_dev_v2_header_info(struct rbd_device *rbd_dev, + struct rbd_image_header *header, + bool first_time) { - bool first_time = rbd_dev->header.object_prefix == NULL; int ret; - ret = rbd_dev_v2_image_size(rbd_dev); + ret = _rbd_dev_v2_snap_size(rbd_dev, CEPH_NOSNAP, + first_time ? &header->obj_order : NULL, + &header->image_size); if (ret) return ret; if (first_time) { - ret = rbd_dev_v2_header_onetime(rbd_dev); + ret = rbd_dev_v2_header_onetime(rbd_dev, header); if (ret) return ret; } - ret = rbd_dev_v2_snap_context(rbd_dev); - if (ret && first_time) { - kfree(rbd_dev->header.object_prefix); - rbd_dev->header.object_prefix = NULL; - } + ret = rbd_dev_v2_snap_context(rbd_dev, &header->snapc); + if (ret) + return ret; - return ret; + return 0; } -static int rbd_dev_header_info(struct rbd_device *rbd_dev) +static int rbd_dev_header_info(struct rbd_device *rbd_dev, + struct rbd_image_header *header, + bool first_time) { rbd_assert(rbd_image_format_valid(rbd_dev->image_format)); + rbd_assert(!header->object_prefix && !header->snapc); if (rbd_dev->image_format == 1) - return rbd_dev_v1_header_info(rbd_dev); + return rbd_dev_v1_header_info(rbd_dev, header, first_time); - return rbd_dev_v2_header_info(rbd_dev); + return rbd_dev_v2_header_info(rbd_dev, header, first_time); } /* @@ -6699,60 +6703,49 @@ static int rbd_dev_image_id(struct rbd_device *rbd_dev) */ static void rbd_dev_unprobe(struct rbd_device *rbd_dev) { - struct rbd_image_header *header; - rbd_dev_parent_put(rbd_dev); rbd_object_map_free(rbd_dev); rbd_dev_mapping_clear(rbd_dev); /* Free dynamic fields from the header, then zero it out */ - header = &rbd_dev->header; - ceph_put_snap_context(header->snapc); - kfree(header->snap_sizes); - kfree(header->snap_names); - kfree(header->object_prefix); - memset(header, 0, sizeof (*header)); + rbd_image_header_cleanup(&rbd_dev->header); } -static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev) +static int rbd_dev_v2_header_onetime(struct rbd_device *rbd_dev, + struct rbd_image_header *header) { int ret; - ret = rbd_dev_v2_object_prefix(rbd_dev); + ret = rbd_dev_v2_object_prefix(rbd_dev, &header->object_prefix); if (ret) - goto out_err; + return ret; /* * Get the and check features for the image. Currently the * features are assumed to never change. */ - ret = rbd_dev_v2_features(rbd_dev); + ret = _rbd_dev_v2_snap_features(rbd_dev, CEPH_NOSNAP, + rbd_is_ro(rbd_dev), &header->features); if (ret) - goto out_err; + return ret; /* If the image supports fancy striping, get its parameters */ - if (rbd_dev->header.features & RBD_FEATURE_STRIPINGV2) { - ret = rbd_dev_v2_striping_info(rbd_dev); - if (ret < 0) - goto out_err; + if (header->features & RBD_FEATURE_STRIPINGV2) { + ret = rbd_dev_v2_striping_info(rbd_dev, &header->stripe_unit, + &header->stripe_count); + if (ret) + return ret; } - if (rbd_dev->header.features & RBD_FEATURE_DATA_POOL) { - ret = rbd_dev_v2_data_pool(rbd_dev); + if (header->features & RBD_FEATURE_DATA_POOL) { + ret = rbd_dev_v2_data_pool(rbd_dev, &header->data_pool_id); if (ret) - goto out_err; + return ret; } - rbd_init_layout(rbd_dev); return 0; - -out_err: - rbd_dev->header.features = 0; - kfree(rbd_dev->header.object_prefix); - rbd_dev->header.object_prefix = NULL; - return ret; } /* @@ -6947,13 +6940,15 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) if (!depth) down_write(&rbd_dev->header_rwsem); - ret = rbd_dev_header_info(rbd_dev); + ret = rbd_dev_header_info(rbd_dev, &rbd_dev->header, true); if (ret) { if (ret == -ENOENT && !need_watch) rbd_print_dne(rbd_dev, false); goto err_out_probe; } + rbd_init_layout(rbd_dev); + /* * If this image is the one being mapped, we have pool name and * id, image name and id, and snap name - need to fill snap id. @@ -7008,15 +7003,39 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) return ret; } +static void rbd_dev_update_header(struct rbd_device *rbd_dev, + struct rbd_image_header *header) +{ + rbd_assert(rbd_image_format_valid(rbd_dev->image_format)); + rbd_assert(rbd_dev->header.object_prefix); /* !first_time */ + + rbd_dev->header.image_size = header->image_size; + + ceph_put_snap_context(rbd_dev->header.snapc); + rbd_dev->header.snapc = header->snapc; + header->snapc = NULL; + + if (rbd_dev->image_format == 1) { + kfree(rbd_dev->header.snap_names); + rbd_dev->header.snap_names = header->snap_names; + header->snap_names = NULL; + + kfree(rbd_dev->header.snap_sizes); + rbd_dev->header.snap_sizes = header->snap_sizes; + header->snap_sizes = NULL; + } +} + static int rbd_dev_refresh(struct rbd_device *rbd_dev) { + struct rbd_image_header header = { 0 }; u64 mapping_size; int ret; down_write(&rbd_dev->header_rwsem); mapping_size = rbd_dev->mapping.size; - ret = rbd_dev_header_info(rbd_dev); + ret = rbd_dev_header_info(rbd_dev, &header, false); if (ret) goto out; @@ -7030,6 +7049,8 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) goto out; } + rbd_dev_update_header(rbd_dev, &header); + rbd_assert(!rbd_is_snap(rbd_dev)); rbd_dev->mapping.size = rbd_dev->header.image_size; @@ -7038,6 +7059,7 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) if (!ret && mapping_size != rbd_dev->mapping.size) rbd_dev_update_size(rbd_dev); + rbd_image_header_cleanup(&header); return ret; } From patchwork Mon Sep 25 19:40:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 13398328 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62098CD54BE for ; Mon, 25 Sep 2023 19:41:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233280AbjIYTlI (ORCPT ); Mon, 25 Sep 2023 15:41:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39460 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233216AbjIYTlH (ORCPT ); Mon, 25 Sep 2023 15:41:07 -0400 Received: from mail-ed1-x52a.google.com (mail-ed1-x52a.google.com [IPv6:2a00:1450:4864:20::52a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0133101 for ; Mon, 25 Sep 2023 12:40:59 -0700 (PDT) Received: by mail-ed1-x52a.google.com with SMTP id 4fb4d7f45d1cf-532c81b9adbso8725364a12.1 for ; Mon, 25 Sep 2023 12:40:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695670858; x=1696275658; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=rBmNJe5laBTCH6NjBjIjwBOXFQdiiIHdy0on/tK1+tE=; b=RAU7HwY+zrixs8yqIF72IVDZ0nqP2+hp3GeUPXChAPP4cr8JUhFy1yC0JCcTFqO5zN pQQICLymqSSkAvlWd3SNqwRDCcoIh3xI1Z/icJ/kkzKOlIUUO7YXcqWoDrQygR/Gb2qq ySUSUxfxv/2NSTcJBwJPWqY3dPXiRnK23afz/quUIBuKhk8CD5m5lp9Ps7GqGluZxkxK X2EuSsQ4DyoPiKu6A/eWUdYft2ARP5+nm/W54dBMFOdoGndeTs43UJz2JnpG5wxGXUv6 p7o5NYvkP1883urUstVHLjMbhQcFl2KPmGaw/x3LH04hU2IbX9UMBQSOfDDHd1+I7bwW 2fsg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695670858; x=1696275658; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rBmNJe5laBTCH6NjBjIjwBOXFQdiiIHdy0on/tK1+tE=; b=FNgl4VkEZFrKlqv81oT3lH/2Y0uzTjx4uo2CdgYfEKO6cts0prwd9UcXH0WqKjy+et iUuGQUIljIHyL68YXaxzmM5n2YgOKZoXyRkiUQKZYmt+O4wUrgjEljfKnM5OqHlwTfF/ UNoI2hVw1hfkThIWVRmcSilTosLgGs4N62mSlsFFA9gJlDTZlHnvjA+F/96oQw2GW0yI t7D8PvnuD/X0j/IXVmP+6XSjVmjfN55g8482T2MVCiMqA4Im7p4rG4QhIIG2K2jeMeGa oYzktFRwJLdYcK+jW+lp0nvc03vEHA3WxLPxBs//w9u/Wx1zlSTeXrEmSustYKTn5BWU YOgA== X-Gm-Message-State: AOJu0Yyiq7FPUTVp9tlb9qf2eASx+ATXwISCqofWgOqiJch8GZJP527e 9oTn/0rj+3FhMlrlmme0Fioy/opIddE= X-Google-Smtp-Source: AGHT+IF6I+IVczf2BMDfePbe7rRWbJoRvzcmjejz1yFohxnFeBN3YPYsqdx7DLnQqUid5331872JwA== X-Received: by 2002:a05:6402:231a:b0:533:c77b:2f3e with SMTP id l26-20020a056402231a00b00533c77b2f3emr5399393eda.21.1695670858025; Mon, 25 Sep 2023 12:40:58 -0700 (PDT) Received: from localhost.localdomain (ip-94-112-167-15.bb.vodafone.cz. [94.112.167.15]) by smtp.gmail.com with ESMTPSA id en13-20020a056402528d00b005340d9d042bsm1762365edb.40.2023.09.25.12.40.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 12:40:57 -0700 (PDT) From: Ilya Dryomov To: ceph-devel@vger.kernel.org Cc: Dongsheng Yang Subject: [PATCH 3/4] rbd: decouple parent info read-in from updating rbd_dev Date: Mon, 25 Sep 2023 21:40:33 +0200 Message-ID: <20230925194036.197899-4-idryomov@gmail.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230925194036.197899-1-idryomov@gmail.com> References: <20230925194036.197899-1-idryomov@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Unlike header read-in, parent info read-in is already decoupled in get_parent_info(), but it's buried in rbd_dev_v2_parent_info() along with the processing logic. Separate the initial read-in and update read-in logic into rbd_dev_setup_parent() and rbd_dev_update_parent() respectively and have rbd_dev_v2_parent_info() just populate struct parent_image_info (i.e. what get_parent_info() did). Some existing QoI issues, like flatten of a standalone clone being disregarded on refresh, remain. Signed-off-by: Ilya Dryomov --- drivers/block/rbd.c | 142 +++++++++++++++++++++++++------------------- 1 file changed, 80 insertions(+), 62 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 6ed5520ef303..d62a0298c890 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -5594,6 +5594,14 @@ struct parent_image_info { u64 overlap; }; +static void rbd_parent_info_cleanup(struct parent_image_info *pii) +{ + kfree(pii->pool_ns); + kfree(pii->image_id); + + memset(pii, 0, sizeof(*pii)); +} + /* * The caller is responsible for @pii. */ @@ -5663,6 +5671,9 @@ static int __get_parent_info(struct rbd_device *rbd_dev, if (pii->has_overlap) ceph_decode_64_safe(&p, end, pii->overlap, e_inval); + dout("%s pool_id %llu pool_ns %s image_id %s snap_id %llu has_overlap %d overlap %llu\n", + __func__, pii->pool_id, pii->pool_ns, pii->image_id, pii->snap_id, + pii->has_overlap, pii->overlap); return 0; e_inval: @@ -5701,14 +5712,17 @@ static int __get_parent_info_legacy(struct rbd_device *rbd_dev, pii->has_overlap = true; ceph_decode_64_safe(&p, end, pii->overlap, e_inval); + dout("%s pool_id %llu pool_ns %s image_id %s snap_id %llu has_overlap %d overlap %llu\n", + __func__, pii->pool_id, pii->pool_ns, pii->image_id, pii->snap_id, + pii->has_overlap, pii->overlap); return 0; e_inval: return -EINVAL; } -static int get_parent_info(struct rbd_device *rbd_dev, - struct parent_image_info *pii) +static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev, + struct parent_image_info *pii) { struct page *req_page, *reply_page; void *p; @@ -5736,7 +5750,7 @@ static int get_parent_info(struct rbd_device *rbd_dev, return ret; } -static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) +static int rbd_dev_setup_parent(struct rbd_device *rbd_dev) { struct rbd_spec *parent_spec; struct parent_image_info pii = { 0 }; @@ -5746,37 +5760,12 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) if (!parent_spec) return -ENOMEM; - ret = get_parent_info(rbd_dev, &pii); + ret = rbd_dev_v2_parent_info(rbd_dev, &pii); if (ret) goto out_err; - dout("%s pool_id %llu pool_ns %s image_id %s snap_id %llu has_overlap %d overlap %llu\n", - __func__, pii.pool_id, pii.pool_ns, pii.image_id, pii.snap_id, - pii.has_overlap, pii.overlap); - - if (pii.pool_id == CEPH_NOPOOL || !pii.has_overlap) { - /* - * Either the parent never existed, or we have - * record of it but the image got flattened so it no - * longer has a parent. When the parent of a - * layered image disappears we immediately set the - * overlap to 0. The effect of this is that all new - * requests will be treated as if the image had no - * parent. - * - * If !pii.has_overlap, the parent image spec is not - * applicable. It's there to avoid duplication in each - * snapshot record. - */ - if (rbd_dev->parent_overlap) { - rbd_dev->parent_overlap = 0; - rbd_dev_parent_put(rbd_dev); - pr_info("%s: clone image has been flattened\n", - rbd_dev->disk->disk_name); - } - + if (pii.pool_id == CEPH_NOPOOL || !pii.has_overlap) goto out; /* No parent? No problem. */ - } /* The ceph file layout needs to fit pool id in 32 bits */ @@ -5788,46 +5777,34 @@ static int rbd_dev_v2_parent_info(struct rbd_device *rbd_dev) } /* - * The parent won't change (except when the clone is - * flattened, already handled that). So we only need to - * record the parent spec we have not already done so. + * The parent won't change except when the clone is flattened, + * so we only need to record the parent image spec once. */ - if (!rbd_dev->parent_spec) { - parent_spec->pool_id = pii.pool_id; - if (pii.pool_ns && *pii.pool_ns) { - parent_spec->pool_ns = pii.pool_ns; - pii.pool_ns = NULL; - } - parent_spec->image_id = pii.image_id; - pii.image_id = NULL; - parent_spec->snap_id = pii.snap_id; - - rbd_dev->parent_spec = parent_spec; - parent_spec = NULL; /* rbd_dev now owns this */ + parent_spec->pool_id = pii.pool_id; + if (pii.pool_ns && *pii.pool_ns) { + parent_spec->pool_ns = pii.pool_ns; + pii.pool_ns = NULL; } + parent_spec->image_id = pii.image_id; + pii.image_id = NULL; + parent_spec->snap_id = pii.snap_id; + + rbd_assert(!rbd_dev->parent_spec); + rbd_dev->parent_spec = parent_spec; + parent_spec = NULL; /* rbd_dev now owns this */ /* - * We always update the parent overlap. If it's zero we issue - * a warning, as we will proceed as if there was no parent. + * Record the parent overlap. If it's zero, issue a warning as + * we will proceed as if there is no parent. */ - if (!pii.overlap) { - if (parent_spec) { - /* refresh, careful to warn just once */ - if (rbd_dev->parent_overlap) - rbd_warn(rbd_dev, - "clone now standalone (overlap became 0)"); - } else { - /* initial probe */ - rbd_warn(rbd_dev, "clone is standalone (overlap 0)"); - } - } + if (!pii.overlap) + rbd_warn(rbd_dev, "clone is standalone (overlap 0)"); rbd_dev->parent_overlap = pii.overlap; out: ret = 0; out_err: - kfree(pii.pool_ns); - kfree(pii.image_id); + rbd_parent_info_cleanup(&pii); rbd_spec_put(parent_spec); return ret; } @@ -6977,7 +6954,7 @@ static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth) } if (rbd_dev->header.features & RBD_FEATURE_LAYERING) { - ret = rbd_dev_v2_parent_info(rbd_dev); + ret = rbd_dev_setup_parent(rbd_dev); if (ret) goto err_out_probe; } @@ -7026,9 +7003,47 @@ static void rbd_dev_update_header(struct rbd_device *rbd_dev, } } +static void rbd_dev_update_parent(struct rbd_device *rbd_dev, + struct parent_image_info *pii) +{ + if (pii->pool_id == CEPH_NOPOOL || !pii->has_overlap) { + /* + * Either the parent never existed, or we have + * record of it but the image got flattened so it no + * longer has a parent. When the parent of a + * layered image disappears we immediately set the + * overlap to 0. The effect of this is that all new + * requests will be treated as if the image had no + * parent. + * + * If !pii.has_overlap, the parent image spec is not + * applicable. It's there to avoid duplication in each + * snapshot record. + */ + if (rbd_dev->parent_overlap) { + rbd_dev->parent_overlap = 0; + rbd_dev_parent_put(rbd_dev); + pr_info("%s: clone has been flattened\n", + rbd_dev->disk->disk_name); + } + } else { + rbd_assert(rbd_dev->parent_spec); + + /* + * Update the parent overlap. If it became zero, issue + * a warning as we will proceed as if there is no parent. + */ + if (!pii->overlap && rbd_dev->parent_overlap) + rbd_warn(rbd_dev, + "clone has become standalone (overlap 0)"); + rbd_dev->parent_overlap = pii->overlap; + } +} + static int rbd_dev_refresh(struct rbd_device *rbd_dev) { struct rbd_image_header header = { 0 }; + struct parent_image_info pii = { 0 }; u64 mapping_size; int ret; @@ -7044,12 +7059,14 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) * mapped image getting flattened. */ if (rbd_dev->parent) { - ret = rbd_dev_v2_parent_info(rbd_dev); + ret = rbd_dev_v2_parent_info(rbd_dev, &pii); if (ret) goto out; } rbd_dev_update_header(rbd_dev, &header); + if (rbd_dev->parent) + rbd_dev_update_parent(rbd_dev, &pii); rbd_assert(!rbd_is_snap(rbd_dev)); rbd_dev->mapping.size = rbd_dev->header.image_size; @@ -7059,6 +7076,7 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) if (!ret && mapping_size != rbd_dev->mapping.size) rbd_dev_update_size(rbd_dev); + rbd_parent_info_cleanup(&pii); rbd_image_header_cleanup(&header); return ret; } From patchwork Mon Sep 25 19:40:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 13398329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DBC49CD54BF for ; Mon, 25 Sep 2023 19:41:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233308AbjIYTlJ (ORCPT ); Mon, 25 Sep 2023 15:41:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233259AbjIYTlI (ORCPT ); Mon, 25 Sep 2023 15:41:08 -0400 Received: from mail-ed1-x52b.google.com (mail-ed1-x52b.google.com [IPv6:2a00:1450:4864:20::52b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B731C10D for ; Mon, 25 Sep 2023 12:41:00 -0700 (PDT) Received: by mail-ed1-x52b.google.com with SMTP id 4fb4d7f45d1cf-533d6a8d6b6so4766314a12.2 for ; Mon, 25 Sep 2023 12:41:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1695670859; x=1696275659; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=RgrC3QJ03WLcpEEC52C3Ui4nfeLUBQjD+WEIjcVBxVc=; b=Wcq7J5389C3USxH+gKngVz6SMno0BnFbojuErME4sWFOdvQtIaw9rOsSD6It/eN+FL CBP8CaZVd3S7933SbawYuRJeUDI1pvAa4UyH6yk6PBANfE9xLuQt6JTrK6zZRwOnbama YT0Op8TvHiLbvcU+NniveiB79o46z5sq9BfO2h8kEc7Z3Loo7mHUKJGQpF4KC+oCMAtr d3zIpTz5U5ETint3L70mv5C6BQzXvnX+qaGhk+EKGTP0uliX/o3p7IQ1n+U5f2hl8due Y4iThpOZVWn7JW2XCFWu70n6FW7ZDkHpNx8BuD0wj1bmot6FD7UnXdb2uspXw7xgQdpe frqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695670859; x=1696275659; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RgrC3QJ03WLcpEEC52C3Ui4nfeLUBQjD+WEIjcVBxVc=; b=dnH6flvibf9hZWUPpwQ/jQinja+iut4BGzmfOdcdP/JjYmrhUPdZptGAvp+PtHhFxd gPi9QBbkdqqnaJ1HlQbre2/AYKyi2hIdjZUadTPutpDztqENw0ytjG4zTZrxz7KZ5yEE WBVkKugRFwRaK+O4veACDypOOkhHhTA9Q1lJh8nCesS+CNgG0uJBD6+26pRd18ap9j9y r9lAoOxuNV9CYpG4jR1HPYYjmW+OOoF3TIy/vzF8tey0HT7jkAePndRjbRaDABILa0B6 FuNaKezeCm3rcB8r6jmxRsM6Ggl6DqwidOGz5Tze7+W1zE+XA2fsSh8pbTeN4rn7EDEe s7tg== X-Gm-Message-State: AOJu0YzFlBLEVGMnM1ReU4L2VIGRXPQdWUpwoHE1PQ2As4stDl5kOK8F hbAS55A7pzeYn9Fvt43Gaa7eHARxCXU= X-Google-Smtp-Source: AGHT+IF5ueRvJm3nZVn0Eg2GZ2Pvq54bP7A7GNDo0PVGHsh1ajIvurqzM4zAyYswPsmO4P8TtooEYg== X-Received: by 2002:aa7:c741:0:b0:533:5c03:5fce with SMTP id c1-20020aa7c741000000b005335c035fcemr6685405eds.5.1695670859158; Mon, 25 Sep 2023 12:40:59 -0700 (PDT) Received: from localhost.localdomain (ip-94-112-167-15.bb.vodafone.cz. [94.112.167.15]) by smtp.gmail.com with ESMTPSA id en13-20020a056402528d00b005340d9d042bsm1762365edb.40.2023.09.25.12.40.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 12:40:58 -0700 (PDT) From: Ilya Dryomov To: ceph-devel@vger.kernel.org Cc: Dongsheng Yang Subject: [PATCH 4/4] rbd: take header_rwsem in rbd_dev_refresh() only when updating Date: Mon, 25 Sep 2023 21:40:34 +0200 Message-ID: <20230925194036.197899-5-idryomov@gmail.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230925194036.197899-1-idryomov@gmail.com> References: <20230925194036.197899-1-idryomov@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org rbd_dev_refresh() has been holding header_rwsem across header and parent info read-in unnecessarily for ages. With commit 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held"), the potential for deadlocks became much more real owning to a) header_rwsem now nesting inside lock_rwsem and b) rw_semaphores not allowing new readers after a writer is registered. For example, assuming that I/O request 1, I/O request 2 and header read-in request all target the same OSD: 1. I/O request 1 comes in and gets submitted 2. watch error occurs 3. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and releases lock_rwsem 4. after reestablishing the watch, rbd_reregister_watch() calls rbd_dev_refresh() which takes header_rwsem for write and submits a header read-in request 5. I/O request 2 comes in: after taking lock_rwsem for read in __rbd_img_handle_request(), it blocks trying to take header_rwsem for read in rbd_img_object_requests() 6. another watch error occurs 7. rbd_watch_errcb() blocks trying to take lock_rwsem for write 8. I/O request 1 completion is received by the messenger but can't be processed because lock_rwsem won't be granted anymore 9. header read-in request completion can't be received, let alone processed, because the messenger is stranded Change rbd_dev_refresh() to take header_rwsem only for actually updating rbd_dev->header. Header and parent info read-in don't need any locking. Cc: stable@vger.kernel.org # e3580eaee090: rbd: move rbd_dev_refresh() definition Cc: stable@vger.kernel.org # 641d828d82d0: rbd: decouple header read-in from updating rbd_dev->header Cc: stable@vger.kernel.org # 2147c0b31b95: rbd: decouple parent info read-in from updating rbd_dev Cc: stable@vger.kernel.org Fixes: 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held") Signed-off-by: Ilya Dryomov Reviewed-by: Dongsheng Yang --- drivers/block/rbd.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index d62a0298c890..a999b698b131 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -6986,7 +6986,14 @@ static void rbd_dev_update_header(struct rbd_device *rbd_dev, rbd_assert(rbd_image_format_valid(rbd_dev->image_format)); rbd_assert(rbd_dev->header.object_prefix); /* !first_time */ - rbd_dev->header.image_size = header->image_size; + if (rbd_dev->header.image_size != header->image_size) { + rbd_dev->header.image_size = header->image_size; + + if (!rbd_is_snap(rbd_dev)) { + rbd_dev->mapping.size = header->image_size; + rbd_dev_update_size(rbd_dev); + } + } ceph_put_snap_context(rbd_dev->header.snapc); rbd_dev->header.snapc = header->snapc; @@ -7044,11 +7051,9 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) { struct rbd_image_header header = { 0 }; struct parent_image_info pii = { 0 }; - u64 mapping_size; int ret; - down_write(&rbd_dev->header_rwsem); - mapping_size = rbd_dev->mapping.size; + dout("%s rbd_dev %p\n", __func__, rbd_dev); ret = rbd_dev_header_info(rbd_dev, &header, false); if (ret) @@ -7064,18 +7069,13 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) goto out; } + down_write(&rbd_dev->header_rwsem); rbd_dev_update_header(rbd_dev, &header); if (rbd_dev->parent) rbd_dev_update_parent(rbd_dev, &pii); - - rbd_assert(!rbd_is_snap(rbd_dev)); - rbd_dev->mapping.size = rbd_dev->header.image_size; - -out: up_write(&rbd_dev->header_rwsem); - if (!ret && mapping_size != rbd_dev->mapping.size) - rbd_dev_update_size(rbd_dev); +out: rbd_parent_info_cleanup(&pii); rbd_image_header_cleanup(&header); return ret;