diff mbox

[RFC] block: check partition alignment

Message ID 20161214164706.45543-1-sth@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Stefan Haberland Dec. 14, 2016, 4:47 p.m. UTC
Partitions that are not aligned to the blocksize of a device may cause
invalid I/O requests because the blocklayer cares only about alignment
within the partition when building requests on partitions.

device
|--------4096--------|--------4096--------|--------4096--------|
partition offset 512byte
|-512-|--------4096--------|--------4096--------|--------4096--------|

When reading/writing one 4k block of the partition this maps to
reading/writing with an offset of 512 byte of the device leading to
unaligned requests for the device which in turn may cause unexpected
behavior of the device driver.

For DASD devices we have to translate the block number into a cylinder,
head, record format. The unaligned requests lead to wrong calculation
and therefore to misdirected I/O. In a "good" case this leads to I/O
errors because the underlying hardware detects the wrong addressing.
In a worst case scenario this might destroy data on the device.

To prevent partitions that are not aligned to the physical blocksize
of a device check for the alignment in the blkpg_ioctl.

Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
---
 block/ioctl.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Christoph Hellwig Dec. 14, 2016, 5:07 p.m. UTC | #1
> To prevent partitions that are not aligned to the physical blocksize
> of a device check for the alignment in the blkpg_ioctl.

We'd also need to reject this when reading partitions from disk, right?

> +			/* check if partition is aligned to blocksize */
> +			if (p.start % bdev_physical_block_size(bdev) != 0)

And this should be bdev_logical_block_size, as the logical block size
is the only thing that matters for the OS - exposing the physical block
size is just an optional hint to prevent users from doing stupid
things (like creating unaligned partitions :))
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Damien Le Moal Dec. 15, 2016, 12:29 a.m. UTC | #2
Stefan,

On 12/15/16 01:47, Stefan Haberland wrote:
> Partitions that are not aligned to the blocksize of a device may cause
> invalid I/O requests because the blocklayer cares only about alignment
> within the partition when building requests on partitions.
> 
> device
> |--------4096--------|--------4096--------|--------4096--------|
> partition offset 512byte
> |-512-|--------4096--------|--------4096--------|--------4096--------|
> 
> When reading/writing one 4k block of the partition this maps to
> reading/writing with an offset of 512 byte of the device leading to
> unaligned requests for the device which in turn may cause unexpected
> behavior of the device driver.
> 
> For DASD devices we have to translate the block number into a cylinder,
> head, record format. The unaligned requests lead to wrong calculation
> and therefore to misdirected I/O. In a "good" case this leads to I/O
> errors because the underlying hardware detects the wrong addressing.
> In a worst case scenario this might destroy data on the device.
> 
> To prevent partitions that are not aligned to the physical blocksize
> of a device check for the alignment in the blkpg_ioctl.
> 
> Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
> ---
>  block/ioctl.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 755119c..8b83afee 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -45,6 +45,9 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
>  				    || pstart < 0 || plength < 0 || partno > 65535)
>  					return -EINVAL;
>  			}
> +			/* check if partition is aligned to blocksize */
> +			if (p.start % bdev_physical_block_size(bdev) != 0)

sd.c ensures that the logical block size (sector size in sd.c) is a
power of 2 between 512 and 4096. So you can use:

if (p.start & (bdev_physical_block_size(bdev) - 1))

Or use div_u64_rem to avoid an error on 32 bits builds.

Best regards.
Damien Le Moal Dec. 15, 2016, 12:55 a.m. UTC | #3
> sd.c ensures that the logical block size (sector size in sd.c) is a
> power of 2 between 512 and 4096. So you can use:
> 
> if (p.start & (bdev_physical_block_size(bdev) - 1))

Sorry, that was a little too short as a complete proof:
sd.c ensures that the logical block size (sector size in sd.c) is a
power of 2 between 512 and 4096, and the physical block size is a power
of 2 number of logical blocks. So the physical block size is also always
a power of 2.

> 
> Or use div_u64_rem to avoid an error on 32 bits builds.
> 
> Best regards.
>
Damien Le Moal Dec. 15, 2016, 1:33 a.m. UTC | #4
Christoph,

On 12/15/16 02:07, Christoph Hellwig wrote:
>> To prevent partitions that are not aligned to the physical blocksize
>> of a device check for the alignment in the blkpg_ioctl.
> 
> We'd also need to reject this when reading partitions from disk, right?

Only for DASD devices, no ?

Logical block size aligned partitions are fine for regular block
devices. Not aligning on the physical block size is indeed very stupid,
but will not generate errors and an application can see that through
bdev_alignment_offset() and the sysfs alignment_offset file of the
partition.

> 
>> +			/* check if partition is aligned to blocksize */
>> +			if (p.start % bdev_physical_block_size(bdev) != 0)
> 
> And this should be bdev_logical_block_size, as the logical block size
> is the only thing that matters for the OS - exposing the physical block
> size is just an optional hint to prevent users from doing stupid
> things (like creating unaligned partitions :))

For a regular block device, I agree. But in Stephan case, I think that
the check really needs to be against the physical block size, with the
added condition that the bdev is a DASD device (similarly to the zone
alignment check for zoned block devices).

So this should become something like:

if (p.start & (bdev_logical_block_size(bdev) - 1))
	return -EINVAL;
if (bdev_is_dasd(bdev) &&
    p.start & (bdev_physical_block_size(bdev) - 1))
	return -EINVAL;

I am not sure however how bdev_is_dasd can be implemented though.

Best regards.
Christoph Hellwig Dec. 15, 2016, 8:45 a.m. UTC | #5
On Thu, Dec 15, 2016 at 10:33:47AM +0900, Damien Le Moal wrote:
> For a regular block device, I agree. But in Stephan case, I think that
> the check really needs to be against the physical block size, with the
> added condition that the bdev is a DASD device (similarly to the zone
> alignment check for zoned block devices).

Then they need to expose a chunk_size.  physical block size is defined
as not having a meaning for the kernel.
--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Damien Le Moal Dec. 15, 2016, 8:56 a.m. UTC | #6
On 12/15/16 17:45, Christoph Hellwig wrote:
> On Thu, Dec 15, 2016 at 10:33:47AM +0900, Damien Le Moal wrote:
>> For a regular block device, I agree. But in Stephan case, I think that
>> the check really needs to be against the physical block size, with the
>> added condition that the bdev is a DASD device (similarly to the zone
>> alignment check for zoned block devices).
> 
> Then they need to expose a chunk_size.  physical block size is defined
> as not having a meaning for the kernel.

Or create the block device with the logical block size set to the
physical sector size. Which would be even more simple and in fact
correct in this case since only physically aligned accesses make sense
for DASD.
Stefan Haberland Dec. 19, 2016, 4:15 p.m. UTC | #7
Am 14.12.2016 um 18:07 schrieb Christoph Hellwig:
>> To prevent partitions that are not aligned to the physical blocksize
>> > of a device check for the alignment in the blkpg_ioctl.
> We'd also need to reject this when reading partitions from disk, right?
>

I am not sure if there is a problem. I was not able to recreate the error
with a partition read from disk. But simply because I was not able to
write a defective partition table to disk. All variants I tried where OK.
So I am at least not aware of a way to break it via the partition
detection code.
I just noticed that the ioctl which can be called by anyone is
able to establish defective partitions.


Am 15.12.2016 um 09:56 schrieb Damien Le Moal:
> On 12/15/16 17:45, Christoph Hellwig wrote:
>> On Thu, Dec 15, 2016 at 10:33:47AM +0900, Damien Le Moal wrote:
>>> For a regular block device, I agree. But in Stephan case, I think that
>>> the check really needs to be against the physical block size, with the
>>> added condition that the bdev is a DASD device (similarly to the zone
>>> alignment check for zoned block devices).
>>
>> Then they need to expose a chunk_size.  physical block size is defined
>> as not having a meaning for the kernel.
>
> Or create the block device with the logical block size set to the
> physical sector size. Which would be even more simple and in fact
> correct in this case since only physically aligned accesses make sense
> for DASD.
>

We already do it this way. So the logical blocksize is fine for DASD.
I just was not aware of the fact that the physical blocksize is a 
best_can_do field. So I changed it this way and also incorporated your 
other feedback regarding the modulo. Here is the second suggestion for
the patch.

Stefan Haberland (1):
  block: check partition alignment

 block/ioctl.c | 3 +++
 1 file changed, 3 insertions(+)
diff mbox

Patch

diff --git a/block/ioctl.c b/block/ioctl.c
index 755119c..8b83afee 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -45,6 +45,9 @@  static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
 				    || pstart < 0 || plength < 0 || partno > 65535)
 					return -EINVAL;
 			}
+			/* check if partition is aligned to blocksize */
+			if (p.start % bdev_physical_block_size(bdev) != 0)
+				return -EINVAL;
 
 			mutex_lock(&bdev->bd_mutex);