From patchwork Wed Dec 23 20:31:30 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Martin K. Petersen" X-Patchwork-Id: 69601 Received: from mx01.util.phx2.redhat.com (mx1-phx2.redhat.com [209.132.183.26]) by demeter.kernel.org (8.14.3/8.14.2) with ESMTP id nBNKZcqo022860 for ; Wed, 23 Dec 2009 20:35:39 GMT Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by mx01.util.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id nBNKXQFK028580; Wed, 23 Dec 2009 15:33:26 -0500 Received: from int-mx05.intmail.prod.int.phx2.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.18]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id nBNKWhGO009263 for ; Wed, 23 Dec 2009 15:32:43 -0500 Received: from mx1.redhat.com (ext-mx05.extmail.prod.ext.phx2.redhat.com [10.5.110.9]) by int-mx05.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id nBNKWcU2008544; Wed, 23 Dec 2009 15:32:38 -0500 Received: from acsinet12.oracle.com (acsinet12.oracle.com [141.146.126.234]) by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id nBNKWMTX008747; Wed, 23 Dec 2009 15:32:23 -0500 Received: from acsinet15.oracle.com (acsinet15.oracle.com [141.146.126.227]) by acsinet12.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id nBNKWB3P013689 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 23 Dec 2009 20:32:12 GMT Received: from acsmt355.oracle.com (acsmt355.oracle.com [141.146.40.155]) by acsinet15.oracle.com (Switch-3.4.2/Switch-3.4.1) with ESMTP id nBNIoUkL007609; Wed, 23 Dec 2009 20:32:10 GMT Received: from abhmt007.oracle.com by acsmt357.oracle.com with ESMTP id 1210435871261600295; Wed, 23 Dec 2009 14:31:35 -0600 Received: from groovelator.mkp.net (/209.217.122.111) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 23 Dec 2009 12:31:34 -0800 From: "Martin K. Petersen" To: jens.axboe@oracle.com, snitzer@redhat.com, dm-devel@redhat.com Date: Wed, 23 Dec 2009 15:31:30 -0500 Message-Id: <1261600290-29627-3-git-send-email-martin.petersen@oracle.com> In-Reply-To: <20091223171351.GA17896@redhat.com> References: <20091223171351.GA17896@redhat.com> X-Source-IP: acsmt355.oracle.com [141.146.40.155] X-Auth-Type: Internal IP X-CT-RefId: str=0001.0A090203.4B327E4A.015E:SCFMA4539814,ss=1,fgs=0 X-RedHat-Spam-Score: -103.988 (AWL, RCVD_IN_DNSWL_MED, UNPARSEABLE_RELAY, USER_IN_WHITELIST) X-Scanned-By: MIMEDefang 2.67 on 10.5.11.18 X-Scanned-By: MIMEDefang 2.67 on 10.5.110.9 X-loop: dm-devel@redhat.com Cc: "Martin K. Petersen" Subject: [dm-devel] [PATCH 2/2] block: Fix topology stacking for data and discard alignment X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk Reply-To: device-mapper development List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com diff --git a/block/blk-settings.c b/block/blk-settings.c index 6ae118d..d52d4ad 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -505,21 +505,30 @@ static unsigned int lcm(unsigned int a, unsigned int b) /** * blk_stack_limits - adjust queue_limits for stacked devices - * @t: the stacking driver limits (top) - * @b: the underlying queue limits (bottom) + * @t: the stacking driver limits (top device) + * @b: the underlying queue limits (bottom, component device) * @offset: offset to beginning of data within component device * * Description: - * Merges two queue_limit structs. Returns 0 if alignment didn't - * change. Returns -1 if adding the bottom device caused - * misalignment. + * This function is used by stacking drivers like MD and DM to ensure + * that all component devices have compatible block sizes and + * alignments. The stacking driver must provide a queue_limits + * struct (top) and then iteratively call the stacking function for + * all component (bottom) devices. The stacking function will + * attempt to combine the values and ensure proper alignment. + * + * Returns 0 if the top and bottom queue_limits are compatible. The + * top device's block sizes and alignment offsets may be adjusted to + * ensure alignment with the bottom device. If no compatible sizes + * and alignments exist, -1 is returned and the resulting top + * queue_limits will have the misaligned flag set to indicate that + * the alignment_offset is undefined. */ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, sector_t offset) { - int ret; - - ret = 0; + sector_t alignment; + unsigned int top, bottom; t->max_sectors = min_not_zero(t->max_sectors, b->max_sectors); t->max_hw_sectors = min_not_zero(t->max_hw_sectors, b->max_hw_sectors); @@ -537,6 +546,22 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->max_segment_size = min_not_zero(t->max_segment_size, b->max_segment_size); + alignment = queue_limit_alignment_offset(b, offset); + + /* Bottom device has different alignment. Check that it is + * compatible with the current top alignment. + */ + if (t->alignment_offset != alignment) { + + top = max(t->physical_block_size, t->io_min) + + t->alignment_offset; + bottom = max(b->physical_block_size, b->io_min) + alignment; + + /* Verify that top and bottom intervals line up */ + if (max(top, bottom) & (min(top, bottom) - 1)) + t->misaligned = 1; + } + t->logical_block_size = max(t->logical_block_size, b->logical_block_size); @@ -544,54 +569,64 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, b->physical_block_size); t->io_min = max(t->io_min, b->io_min); + t->io_opt = lcm(t->io_opt, b->io_opt); + t->no_cluster |= b->no_cluster; t->discard_zeroes_data &= b->discard_zeroes_data; - /* Bottom device offset aligned? */ - if (offset && - (offset & (b->physical_block_size - 1)) != b->alignment_offset) { + /* Physical block size a multiple of the logical block size? */ + if (t->physical_block_size & (t->logical_block_size - 1)) { + t->physical_block_size = t->logical_block_size; t->misaligned = 1; - ret = -1; } - /* - * Temporarily disable discard granularity. It's currently buggy - * since we default to 0 for discard_granularity, hence this - * "failure" will always trigger for non-zero offsets. - */ -#if 0 - if (offset && - (offset & (b->discard_granularity - 1)) != b->discard_alignment) { - t->discard_misaligned = 1; - ret = -1; + /* Minimum I/O a multiple of the physical block size? */ + if (t->io_min & (t->physical_block_size - 1)) { + t->io_min = t->physical_block_size; + t->misaligned = 1; } -#endif - - /* If top has no alignment offset, inherit from bottom */ - if (!t->alignment_offset) - t->alignment_offset = - b->alignment_offset & (b->physical_block_size - 1); - if (!t->discard_alignment) - t->discard_alignment = - b->discard_alignment & (b->discard_granularity - 1); - - /* Top device aligned on logical block boundary? */ - if (t->alignment_offset & (t->logical_block_size - 1)) { + /* Optimal I/O a multiple of the physical block size? */ + if (t->io_opt & (t->physical_block_size - 1)) { + t->io_opt = 0; t->misaligned = 1; - ret = -1; } - /* Find lcm() of optimal I/O size and granularity */ - t->io_opt = lcm(t->io_opt, b->io_opt); - t->discard_granularity = lcm(t->discard_granularity, - b->discard_granularity); + /* Find lowest common alignment_offset */ + t->alignment_offset = lcm(t->alignment_offset, alignment) + & (max(t->physical_block_size, t->io_min) - 1); - /* Verify that optimal I/O size is a multiple of io_min */ - if (t->io_min && t->io_opt % t->io_min) - ret = -1; + /* Verify that new alignment_offset is on a logical block boundary */ + if (t->alignment_offset & (t->logical_block_size - 1)) + t->misaligned = 1; + + /* Discard alignment and granularity */ + if (b->discard_granularity) { + unsigned int granularity = b->discard_granularity; + offset &= granularity - 1; + + alignment = (granularity + b->discard_alignment - offset) + & (granularity - 1); + + if (t->discard_granularity != 0 && + t->discard_alignment != alignment) { + top = t->discard_granularity + t->discard_alignment; + bottom = b->discard_granularity + alignment; + + /* Verify that top and bottom intervals line up */ + if (max(top, bottom) & (min(top, bottom) - 1)) + t->discard_misaligned = 1; + } + + t->max_discard_sectors = min_not_zero(t->max_discard_sectors, + b->max_discard_sectors); + t->discard_granularity = max(t->discard_granularity, + b->discard_granularity); + t->discard_alignment = lcm(t->discard_alignment, alignment) & + (t->discard_granularity - 1); + } - return ret; + return t->misaligned ? -1 : 0; } EXPORT_SYMBOL(blk_stack_limits);