From patchwork Mon Feb 8 08:52:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 12091809 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D903C433DB for ; Wed, 17 Feb 2021 15:37:12 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C04B364D9A for ; Wed, 17 Feb 2021 15:37:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C04B364D9A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: mail.kernel.org; spf=tempfail smtp.mailfrom=dm-devel-bounces@redhat.com Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-136-4X7t2SZdM8WjuyXAWbjZcg-1; Wed, 17 Feb 2021 10:37:08 -0500 X-MC-Unique: 4X7t2SZdM8WjuyXAWbjZcg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8FF2D100CCC2; Wed, 17 Feb 2021 15:37:02 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 6F1B05D9D2; Wed, 17 Feb 2021 15:37:02 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 199BD58075; Wed, 17 Feb 2021 15:37:02 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id 1188r3Yb030074 for ; Mon, 8 Feb 2021 03:53:03 -0500 Received: by smtp.corp.redhat.com (Postfix) id E23AF200E1EE; Mon, 8 Feb 2021 08:53:02 +0000 (UTC) Received: from mimecast-mx02.redhat.com (mimecast01.extmail.prod.ext.rdu2.redhat.com [10.11.55.17]) by smtp.corp.redhat.com (Postfix) with ESMTPS id DD8042026D46 for ; Mon, 8 Feb 2021 08:53:02 +0000 (UTC) Received: from us-smtp-1.mimecast.com (us-smtp-delivery-1.mimecast.com [205.139.110.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA384 (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id C77F8858F17 for ; Mon, 8 Feb 2021 08:53:02 +0000 (UTC) Received: from out30-57.freemail.mail.aliyun.com (out30-57.freemail.mail.aliyun.com [115.124.30.57]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-128-7U57mI0ENqqNIf78QsUFRw-1; Mon, 08 Feb 2021 03:52:58 -0500 X-MC-Unique: 7U57mI0ENqqNIf78QsUFRw-1 X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R111e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04423; MF=jefflexu@linux.alibaba.com; NM=1; PH=DS; RN=8; SR=0; TI=SMTPD_---0UO9soA-_1612774372 Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0UO9soA-_1612774372) by smtp.aliyun-inc.com(127.0.0.1); Mon, 08 Feb 2021 16:52:52 +0800 From: Jeffle Xu To: snitzer@redhat.com, axboe@kernel.dk Date: Mon, 8 Feb 2021 16:52:40 +0800 Message-Id: <20210208085243.82367-9-jefflexu@linux.alibaba.com> In-Reply-To: <20210208085243.82367-1-jefflexu@linux.alibaba.com> References: <20210208085243.82367-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 X-Mimecast-Impersonation-Protect: Policy=CLT - Impersonation Protection Definition; Similar Internal Domain=false; Similar Monitored External Domain=false; Custom External Domain=false; Mimecast External Domain=false; Newly Observed Domain=false; Internal User Name=false; Custom Display Name List=false; Reply-to Address Mismatch=false; Targeted Threat Dictionary=false; Mimecast Threat Dictionary=false; Custom Threat Dictionary=false X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-loop: dm-devel@redhat.com X-Mailman-Approved-At: Wed, 17 Feb 2021 10:31:26 -0500 Cc: caspar@linux.alibaba.com, hch@lst.de, linux-block@vger.kernel.org, joseph.qi@linux.alibaba.com, dm-devel@redhat.com, io-uring@vger.kernel.org Subject: [dm-devel] [PATCH v3 08/11] dm: fix iterate_device sanity check X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dm-devel-bounces@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com According to the definition of dm_iterate_devices_fn: * This function must iterate through each section of device used by the * target until it encounters a non-zero return code, which it then returns. * Returns zero if no callout returned non-zero. For some target type (e.g., dm-stripe), one call of iterate_devices() may iterate multiple underlying devices internally, in which case a non-zero return code returned by iterate_devices_callout_fn will stop the iteration in advance. Thus if we want to ensure that _all_ underlying devices support some kind of attribute, the iteration structure like dm_table_supports_nowait() should be used, while the input iterate_devices_callout_fn should handle the 'not support' semantics. On the opposite, the iteration structure like dm_table_any_dev_attr() should be used if _any_ underlying device supporting this attibute is sufficient. In this case, the input iterate_devices_callout_fn should handle the 'support' semantics. Besides, remove the unnecessary NULL pointer check for request_queue, since the request_queue pointer returned from bdev_get_queue() shall never be NULL according to commit ff9ea323816d ("block, bdi: an active gendisk always has a request_queue associated with it"). Fixes: 545ed20e6df6 ("dm: add infrastructure for DAX support") Fixes: c3c4555edd10 ("dm table: clear add_random unless all devices have it set") Fixes: 4693c9668fdc ("dm table: propagate non rotational flag") Cc: stable@vger.kernel.org Signed-off-by: Jeffle Xu --- drivers/md/dm-table.c | 195 +++++++++++++++++++----------------------- drivers/md/dm.c | 2 +- drivers/md/dm.h | 2 +- 3 files changed, 91 insertions(+), 108 deletions(-) diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 4acf2342f7ad..aa37f3e82238 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -820,24 +820,24 @@ void dm_table_set_type(struct dm_table *t, enum dm_queue_mode type) EXPORT_SYMBOL_GPL(dm_table_set_type); /* validate the dax capability of the target device span */ -int device_supports_dax(struct dm_target *ti, struct dm_dev *dev, - sector_t start, sector_t len, void *data) +int device_not_dax_capable(struct dm_target *ti, struct dm_dev *dev, + sector_t start, sector_t len, void *data) { int blocksize = *(int *) data, id; bool rc; id = dax_read_lock(); - rc = dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len); + rc = !dax_supported(dev->dax_dev, dev->bdev, blocksize, start, len); dax_read_unlock(id); return rc; } /* Check devices support synchronous DAX */ -static int device_dax_synchronous(struct dm_target *ti, struct dm_dev *dev, - sector_t start, sector_t len, void *data) +static int device_not_dax_sync_capable(struct dm_target *ti, struct dm_dev *dev, + sector_t start, sector_t len, void *data) { - return dev->dax_dev && dax_synchronous(dev->dax_dev); + return !dev->dax_dev || !dax_synchronous(dev->dax_dev); } bool dm_table_supports_dax(struct dm_table *t, @@ -854,7 +854,7 @@ bool dm_table_supports_dax(struct dm_table *t, return false; if (!ti->type->iterate_devices || - !ti->type->iterate_devices(ti, iterate_fn, blocksize)) + ti->type->iterate_devices(ti, iterate_fn, blocksize)) return false; } @@ -925,7 +925,7 @@ static int dm_table_determine_type(struct dm_table *t) verify_bio_based: /* We must use this table as bio-based */ t->type = DM_TYPE_BIO_BASED; - if (dm_table_supports_dax(t, device_supports_dax, &page_size) || + if (dm_table_supports_dax(t, device_not_dax_capable, &page_size) || (list_empty(devices) && live_md_type == DM_TYPE_DAX_BIO_BASED)) { t->type = DM_TYPE_DAX_BIO_BASED; } @@ -1331,13 +1331,49 @@ bool dm_table_has_no_data_devices(struct dm_table *table) return true; } -static int device_is_zoned_model(struct dm_target *ti, struct dm_dev *dev, - sector_t start, sector_t len, void *data) +/* + * Cases requiring _any_ underlying device supporting some kind of attribute, + * should use the iteration structure like dm_table_any_dev_attr(), or call + * it directly. @func should handle semantics of positive examples, e.g., + * capable of something. + */ +static bool dm_table_any_dev_attr(struct dm_table *t, + iterate_devices_callout_fn func, void *data) +{ + struct dm_target *ti; + unsigned int i; + + for (i = 0; i < dm_table_get_num_targets(t); i++) { + ti = dm_table_get_target(t, i); + + if (ti->type->iterate_devices && + ti->type->iterate_devices(ti, func, data)) + return true; + } + + return false; +} + +/* + * Cases requiring _all_ underlying devices supporting some kind of attribute, + * should use the iteration structure like dm_table_supports_nowait(), or call + * dm_table_all_devs_attr() directly. @anti_func should handle semantics of + * counter examples, e.g., not capable of something. + */ +static inline bool dm_table_all_devs_attr(struct dm_table *t, + iterate_devices_callout_fn anti_func, + void *data) +{ + return !dm_table_any_dev_attr(t, anti_func, data); +} + +static int device_not_zoned_model(struct dm_target *ti, struct dm_dev *dev, + sector_t start, sector_t len, void *data) { struct request_queue *q = bdev_get_queue(dev->bdev); enum blk_zoned_model *zoned_model = data; - return q && blk_queue_zoned_model(q) == *zoned_model; + return blk_queue_zoned_model(q) != *zoned_model; } static bool dm_table_supports_zoned_model(struct dm_table *t, @@ -1354,37 +1390,20 @@ static bool dm_table_supports_zoned_model(struct dm_table *t, return false; if (!ti->type->iterate_devices || - !ti->type->iterate_devices(ti, device_is_zoned_model, &zoned_model)) + ti->type->iterate_devices(ti, device_not_zoned_model, &zoned_model)) return false; } return true; } -static int device_matches_zone_sectors(struct dm_target *ti, struct dm_dev *dev, - sector_t start, sector_t len, void *data) +static int device_not_matches_zone_sectors(struct dm_target *ti, struct dm_dev *dev, + sector_t start, sector_t len, void *data) { struct request_queue *q = bdev_get_queue(dev->bdev); unsigned int *zone_sectors = data; - return q && blk_queue_zone_sectors(q) == *zone_sectors; -} - -static bool dm_table_matches_zone_sectors(struct dm_table *t, - unsigned int zone_sectors) -{ - struct dm_target *ti; - unsigned i; - - for (i = 0; i < dm_table_get_num_targets(t); i++) { - ti = dm_table_get_target(t, i); - - if (!ti->type->iterate_devices || - !ti->type->iterate_devices(ti, device_matches_zone_sectors, &zone_sectors)) - return false; - } - - return true; + return blk_queue_zone_sectors(q) != *zone_sectors; } static int validate_hardware_zoned_model(struct dm_table *table, @@ -1404,7 +1423,7 @@ static int validate_hardware_zoned_model(struct dm_table *table, if (!zone_sectors || !is_power_of_2(zone_sectors)) return -EINVAL; - if (!dm_table_matches_zone_sectors(table, zone_sectors)) { + if (dm_table_any_dev_attr(table, device_not_matches_zone_sectors, &zone_sectors)) { DMERR("%s: zone sectors is not consistent across all devices", dm_device_name(table->md)); return -EINVAL; @@ -1533,7 +1552,7 @@ static int device_flush_capable(struct dm_target *ti, struct dm_dev *dev, unsigned long flush = (unsigned long) data; struct request_queue *q = bdev_get_queue(dev->bdev); - return q && (q->queue_flags & flush); + return (q->queue_flags & flush); } static bool dm_table_supports_flush(struct dm_table *t, unsigned long flush) @@ -1578,54 +1597,20 @@ static int device_dax_write_cache_enabled(struct dm_target *ti, return false; } -static int dm_table_supports_dax_write_cache(struct dm_table *t) -{ - struct dm_target *ti; - unsigned i; - - for (i = 0; i < dm_table_get_num_targets(t); i++) { - ti = dm_table_get_target(t, i); - - if (ti->type->iterate_devices && - ti->type->iterate_devices(ti, - device_dax_write_cache_enabled, NULL)) - return true; - } - - return false; -} - -static int device_is_nonrot(struct dm_target *ti, struct dm_dev *dev, - sector_t start, sector_t len, void *data) +static int device_is_rot(struct dm_target *ti, struct dm_dev *dev, + sector_t start, sector_t len, void *data) { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && blk_queue_nonrot(q); + return !blk_queue_nonrot(q); } static int device_is_not_random(struct dm_target *ti, struct dm_dev *dev, - sector_t start, sector_t len, void *data) + sector_t start, sector_t len, void *data) { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && !blk_queue_add_random(q); -} - -static bool dm_table_all_devices_attribute(struct dm_table *t, - iterate_devices_callout_fn func) -{ - struct dm_target *ti; - unsigned i; - - for (i = 0; i < dm_table_get_num_targets(t); i++) { - ti = dm_table_get_target(t, i); - - if (!ti->type->iterate_devices || - !ti->type->iterate_devices(ti, func, NULL)) - return false; - } - - return true; + return !blk_queue_add_random(q); } static int device_not_write_same_capable(struct dm_target *ti, struct dm_dev *dev, @@ -1633,7 +1618,7 @@ static int device_not_write_same_capable(struct dm_target *ti, struct dm_dev *de { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && !q->limits.max_write_same_sectors; + return !q->limits.max_write_same_sectors; } static bool dm_table_supports_write_same(struct dm_table *t) @@ -1660,7 +1645,7 @@ static int device_not_write_zeroes_capable(struct dm_target *ti, struct dm_dev * { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && !q->limits.max_write_zeroes_sectors; + return !q->limits.max_write_zeroes_sectors; } static bool dm_table_supports_write_zeroes(struct dm_table *t) @@ -1687,7 +1672,7 @@ static int device_not_nowait_capable(struct dm_target *ti, struct dm_dev *dev, { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && !blk_queue_nowait(q); + return !blk_queue_nowait(q); } static bool dm_table_supports_nowait(struct dm_table *t) @@ -1714,7 +1699,7 @@ static int device_not_discard_capable(struct dm_target *ti, struct dm_dev *dev, { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && !blk_queue_discard(q); + return !blk_queue_discard(q); } static bool dm_table_supports_discards(struct dm_table *t) @@ -1748,7 +1733,7 @@ static int device_not_secure_erase_capable(struct dm_target *ti, { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && !blk_queue_secure_erase(q); + return !blk_queue_secure_erase(q); } static bool dm_table_supports_secure_erase(struct dm_table *t) @@ -1776,30 +1761,23 @@ static int device_requires_stable_pages(struct dm_target *ti, { struct request_queue *q = bdev_get_queue(dev->bdev); - return q && blk_queue_stable_writes(q); + return blk_queue_stable_writes(q); } /* - * If any underlying device requires stable pages, a table must require - * them as well. Only targets that support iterate_devices are considered: - * don't want error, zero, etc to require stable pages. + * type->iterate_devices() should be called when the sanity check needs to + * iterate and check all underlying data devices. iterate_devices() will + * iterate all underlying data devices until it encounters a non-zero return + * code, returned by whether the input iterate_devices_callout_fn, or + * iterate_devices() itself internally. + * + * For some target type (e.g., dm-stripe), one call of iterate_devices() may + * iterate multiple underlying devices internally, in which case a non-zero + * return code returned by iterate_devices_callout_fn will stop the iteration + * in advance. + * Thus users should call dm_table_any_dev_attr() or dm_table_all_devs_attr() + * directly if possible. See comments of these two functions for more details. */ -static bool dm_table_requires_stable_pages(struct dm_table *t) -{ - struct dm_target *ti; - unsigned i; - - for (i = 0; i < dm_table_get_num_targets(t); i++) { - ti = dm_table_get_target(t, i); - - if (ti->type->iterate_devices && - ti->type->iterate_devices(ti, device_requires_stable_pages, NULL)) - return true; - } - - return false; -} - void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, struct queue_limits *limits) { @@ -1837,22 +1815,22 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, } blk_queue_write_cache(q, wc, fua); - if (dm_table_supports_dax(t, device_supports_dax, &page_size)) { + if (dm_table_supports_dax(t, device_not_dax_capable, &page_size)) { blk_queue_flag_set(QUEUE_FLAG_DAX, q); - if (dm_table_supports_dax(t, device_dax_synchronous, NULL)) + if (dm_table_supports_dax(t, device_not_dax_sync_capable, NULL)) set_dax_synchronous(t->md->dax_dev); } else blk_queue_flag_clear(QUEUE_FLAG_DAX, q); - if (dm_table_supports_dax_write_cache(t)) + if (dm_table_any_dev_attr(t, device_dax_write_cache_enabled, NULL)) dax_write_cache(t->md->dax_dev, true); /* Ensure that all underlying devices are non-rotational. */ - if (dm_table_all_devices_attribute(t, device_is_nonrot)) - blk_queue_flag_set(QUEUE_FLAG_NONROT, q); - else + if (dm_table_any_dev_attr(t, device_is_rot, NULL)) blk_queue_flag_clear(QUEUE_FLAG_NONROT, q); + else + blk_queue_flag_set(QUEUE_FLAG_NONROT, q); if (!dm_table_supports_write_same(t)) q->limits.max_write_same_sectors = 0; @@ -1864,8 +1842,11 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, /* * Some devices don't use blk_integrity but still want stable pages * because they do their own checksumming. + * If any underlying device requires stable pages, a table must require + * them as well. Only targets that support iterate_devices are considered: + * don't want error, zero, etc to require stable pages. */ - if (dm_table_requires_stable_pages(t)) + if (dm_table_any_dev_attr(t, device_requires_stable_pages, NULL)) blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, q); else blk_queue_flag_clear(QUEUE_FLAG_STABLE_WRITES, q); @@ -1876,8 +1857,10 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, * Clear QUEUE_FLAG_ADD_RANDOM if any underlying device does not * have it set. */ - if (blk_queue_add_random(q) && dm_table_all_devices_attribute(t, device_is_not_random)) + if (dm_table_any_dev_attr(t, device_is_not_random, NULL)) blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, q); + else + blk_queue_flag_set(QUEUE_FLAG_ADD_RANDOM, q); /* * For a zoned target, the number of zones should be updated for the diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 46ca3b739396..c2945c90745e 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1128,7 +1128,7 @@ static bool dm_dax_supported(struct dax_device *dax_dev, struct block_device *bd if (!map) goto out; - ret = dm_table_supports_dax(map, device_supports_dax, &blocksize); + ret = dm_table_supports_dax(map, device_not_dax_capable, &blocksize); out: dm_put_live_table(md, srcu_idx); diff --git a/drivers/md/dm.h b/drivers/md/dm.h index fffe1e289c53..b441ad772c18 100644 --- a/drivers/md/dm.h +++ b/drivers/md/dm.h @@ -73,7 +73,7 @@ void dm_table_free_md_mempools(struct dm_table *t); struct dm_md_mempools *dm_table_get_md_mempools(struct dm_table *t); bool dm_table_supports_dax(struct dm_table *t, iterate_devices_callout_fn fn, int *blocksize); -int device_supports_dax(struct dm_target *ti, struct dm_dev *dev, +int device_not_dax_capable(struct dm_target *ti, struct dm_dev *dev, sector_t start, sector_t len, void *data); void dm_lock_md_type(struct mapped_device *md);