From patchwork Thu Jan 8 09:00:34 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Martin K. Petersen" X-Patchwork-Id: 1314 X-Patchwork-Delegate: agk@redhat.com Received: from hormel.redhat.com (hormel1.redhat.com [209.132.177.33]) by demeter.kernel.org (8.14.2/8.14.2) with ESMTP id n088vMvG014123 for ; Thu, 8 Jan 2009 00:57:23 -0800 Received: from listman.util.phx.redhat.com (listman.util.phx.redhat.com [10.8.4.110]) by hormel.redhat.com (Postfix) with ESMTP id 1C4BD61A4A7; Thu, 8 Jan 2009 04:01:00 -0500 (EST) Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by listman.util.phx.redhat.com (8.13.1/8.13.1) with ESMTP id n0890v3c031295 for ; Thu, 8 Jan 2009 04:00:57 -0500 Received: from mx1.redhat.com (mx1.redhat.com [172.16.48.31]) by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id n0890vEg010683; Thu, 8 Jan 2009 04:00:57 -0500 Received: from acsinet12.oracle.com (acsinet12.oracle.com [141.146.126.234]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id n0890fkO026732; Thu, 8 Jan 2009 04:00:41 -0500 Received: from acsinet13.oracle.com (acsinet13.oracle.com [141.146.126.235]) by acsinet12.oracle.com (Switch-3.3.1/Switch-3.3.1) with ESMTP id n08909vG025521 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 8 Jan 2009 09:00:10 GMT Received: from acsmt706.oracle.com (acsmt706.oracle.com [141.146.40.84]) by acsinet13.oracle.com (Switch-3.3.1/Switch-3.3.1) with ESMTP id n0891EqB002976; Thu, 8 Jan 2009 09:01:15 GMT Received: from groovelator.mkp.net (/209.217.122.111) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Thu, 08 Jan 2009 09:00:35 +0000 To: Alasdair G Kergon Subject: Re: [dm-devel] [PATCH] DM Block integrity support From: "Martin K. Petersen" Organization: Oracle References: <20090107193123.GQ3512@agk.fab.redhat.com> Date: Thu, 08 Jan 2009 04:00:34 -0500 In-Reply-To: <20090107193123.GQ3512@agk.fab.redhat.com> (Alasdair G. Kergon's message of "Wed\, 7 Jan 2009 19\:31\:23 +0000") Message-ID: User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux) MIME-Version: 1.0 X-Source-IP: acsmt706.oracle.com [141.146.40.84] X-Auth-Type: Internal IP X-CT-RefId: str=0001.0A090206.4965C0B5.01DD:SCFSTAT928724,ss=1,fgs=0 X-RedHat-Spam-Score: -103.928 X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254 X-Scanned-By: MIMEDefang 2.63 on 172.16.48.31 X-loop: dm-devel@redhat.com Cc: device-mapper development , jens.axboe@oracle.com X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.5 Precedence: junk Reply-To: device-mapper development List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com >>>>> "Alasdair" == Alasdair G Kergon writes: Alasdair> [FIXME: Looks as if that will hit BUG_ON(template == NULL) - Alasdair> maybe there's an undocumented dependency on some other patch? Sorry, yes. That note got lost in the noise when I sliced and diced the patch set. There's a patch in Jens' tree that uses NULL to indicate the default (empty) profile. Subsequent calls to register will simply update the already allocated profile. Alasdair> [FIXME: Does this cope with integrity profile needing removing Alasdair> after a table swap?] Massaged version below. I incorporated your changes. Alasdair> [FIXME: blk_integrity_register() allocating & manipulating Alasdair> kobjects needs moving outside __bind(). - Does the Alasdair> preallocation address this now, Yep. Alasdair> with the exception of some unimportant races we'll ignore but Alasdair> should document? E.g. If integrity profile appears on an Alasdair> underlying device between the reload and the table swap. Alasdair> Would it be better for the table swap to fail in that Alasdair> particular case?] An integrity profile doesn't just appear out of the blue. It takes several hours to reformat a SCSI disk to turn it on (and the drive is offline in the meantime). So I think this is mostly in the academic interest department. I guess a runtime state transition could occur if you have a DM device on top of a mirror with one drive with integrity enabled and one without. And you then swap the legacy disk out with an integrity device in causing the RAID1 to suddenly be flagged as capable. But at least from a physical storage device perspective the integrity capability is a persistent state. There's another corner case, though. If you go from all legacy devices to all integrity ditto no profile will be registered. That's an unfortunate side-effect of doing preallocation. But also an unlikely scenario, I'd say. dm: Add support for data integrity to DM This patch provides support for data integrity passthrough in the device mapper. - If one or more component devices support integrity an integrity profile is preallocated for the DM device. - If all component devices have compatible profiles the DM device is flagged as capable. - Handle integrity metadata when splitting and cloning bios. Signed-off-by: Martin K. Petersen --- -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel diff --git a/drivers/md/dm-ioctl.c b/drivers/md/dm-ioctl.c --- a/drivers/md/dm-ioctl.c +++ b/drivers/md/dm-ioctl.c @@ -1046,6 +1046,19 @@ static int populate_table(struct dm_tabl return dm_table_complete(table); } +static int table_prealloc_integrity(struct dm_table *t, + struct mapped_device *md) +{ + struct list_head *devices = dm_table_get_devices(t); + struct dm_dev_internal *dd; + + list_for_each_entry(dd, devices, list) + if (bdev_get_integrity(dd->dm_dev.bdev)) + return blk_integrity_register(dm_disk(md), NULL); + + return 0; +} + static int table_load(struct dm_ioctl *param, size_t param_size) { int r; @@ -1064,6 +1077,14 @@ static int table_load(struct dm_ioctl *p r = populate_table(t, param, param_size); if (r) { dm_table_put(t); + goto out; + } + + r = table_prealloc_integrity(t, md); + if (r) { + DMERR("%s: could not register integrity profile.", + dm_device_name(md)); + dm_table_destroy(t); goto out; } diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -877,6 +877,45 @@ struct dm_target *dm_table_find_target(s return &t->targets[(KEYS_PER_NODE * n) + k]; } +/* + * Set the integrity profile for this device if all devices used have + * matching profiles. + */ +static void dm_table_set_integrity(struct dm_table *t) +{ + struct list_head *devices = dm_table_get_devices(t); + struct dm_dev_internal *prev = NULL, *dd = NULL; + + if (!blk_get_integrity(dm_disk(t->md))) + return; + + list_for_each_entry(dd, devices, list) { + if (prev && + blk_integrity_compare(prev->dm_dev.bdev->bd_disk, + dd->dm_dev.bdev->bd_disk) < 0) { + DMWARN("%s: integrity not set: %s and %s mismatch", + dm_device_name(t->md), + prev->dm_dev.bdev->bd_disk->disk_name, + dd->dm_dev.bdev->bd_disk->disk_name); + goto no_integrity; + } + prev = dd; + } + + if (!prev || !bdev_get_integrity(prev->dm_dev.bdev)) + goto no_integrity; + + blk_integrity_register(dm_disk(t->md), + bdev_get_integrity(prev->dm_dev.bdev)); + + return; + +no_integrity: + blk_integrity_register(dm_disk(t->md), NULL); + + return; +} + void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q) { /* @@ -897,6 +936,7 @@ void dm_table_set_restrictions(struct dm else queue_flag_set_unlocked(QUEUE_FLAG_CLUSTER, q); + dm_table_set_integrity(t); } unsigned int dm_table_get_num_targets(struct dm_table *t) diff --git a/drivers/md/dm.c b/drivers/md/dm.c --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -703,6 +703,12 @@ static struct bio *split_bvec(struct bio clone->bi_io_vec->bv_len = clone->bi_size; clone->bi_flags |= 1 << BIO_CLONED; + if (bio_integrity(bio)) { + bio_integrity_clone(clone, bio, bs); + bio_integrity_trim(clone, + bio_sector_offset(bio, idx, offset), len); + } + return clone; } @@ -723,6 +729,14 @@ static struct bio *clone_bio(struct bio clone->bi_vcnt = idx + bv_count; clone->bi_size = to_bytes(len); clone->bi_flags &= ~(1 << BIO_SEG_VALID); + + if (bio_integrity(bio)) { + bio_integrity_clone(clone, bio, bs); + + if (idx != bio->bi_idx || clone->bi_size < bio->bi_size) + bio_integrity_trim(clone, + bio_sector_offset(bio, idx, 0), len); + } return clone; } @@ -1191,6 +1205,7 @@ static void free_dev(struct mapped_devic mempool_destroy(md->tio_pool); mempool_destroy(md->io_pool); bioset_free(md->bs); + blk_integrity_unregister(md->disk); del_gendisk(md->disk); free_minor(minor);