Message ID | 20210416030528.757513-4-damien.lemoal@wdc.com (mailing list archive) |
---|---|
State | New, archived |
Series | Fix dm-crypt zoned block device support |
On Fri, Apr 16, 2021 at 12:05:27PM +0900, Damien Le Moal wrote:
> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>
> For zoned btrfs, zone append is mandatory to write to a sequential write
> only zone, otherwise parallel writes to the same zone could result in
> unaligned write errors.
>
> If a zoned block device does not support zone append (e.g. a dm-crypt
> zoned device using a non-NULL IV cypher), fail to mount.
>
> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

Added to misc-next, thanks. I'll queue it for 5.13, it's not an urgent
fix for the 5.12 release, but I'll tag it as stable so it'll appear in
5.12.x later.
On Fri, Apr 16, 2021 at 06:17:21PM +0200, David Sterba wrote:
> On Fri, Apr 16, 2021 at 12:05:27PM +0900, Damien Le Moal wrote:
> > From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> >
> > For zoned btrfs, zone append is mandatory to write to a sequential write
> > only zone, otherwise parallel writes to the same zone could result in
> > unaligned write errors.
> >
> > If a zoned block device does not support zone append (e.g. a dm-crypt
> > zoned device using a non-NULL IV cypher), fail to mount.
> >
> > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> > Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>
> Added to misc-next, thanks. I'll queue it for 5.13, it's not an urgent
> fix for the 5.12 release, but I'll tag it as stable so it'll appear in
> 5.12.x later.

Please don't. Zone append is a strict requirement for zoned devices,
no need to add cargo cult code like this everywhere.
On 2021/04/19 18:29, Christoph Hellwig wrote:
> On Fri, Apr 16, 2021 at 06:17:21PM +0200, David Sterba wrote:
>> On Fri, Apr 16, 2021 at 12:05:27PM +0900, Damien Le Moal wrote:
>>> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>>
>>> For zoned btrfs, zone append is mandatory to write to a sequential write
>>> only zone, otherwise parallel writes to the same zone could result in
>>> unaligned write errors.
>>>
>>> If a zoned block device does not support zone append (e.g. a dm-crypt
>>> zoned device using a non-NULL IV cypher), fail to mount.
>>>
>>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>>
>> Added to misc-next, thanks. I'll queue it for 5.13, it's not an urgent
>> fix for the 5.12 release, but I'll tag it as stable so it'll appear in
>> 5.12.x later.
>
> Please don't. Zone append is a strict requirement for zoned devices,
> no need to add cargo cult code like this everywhere.

This is only to prevent someone from running zoned btrfs on top of
dm-crypt. Without this patch, mount will be OK and file data writes will
also actually be OK. But all reads will miserably fail... I would rather
have this patch in than deal with the "bug reports" about btrfs failing
to read files. No?

Note that like you, I dislike having to add such code. But it was my
oversight when I worked on getting dm-crypt to work on zoned drives. Zone
append was overlooked at that time... My bad, really.
On Mon, Apr 19, 2021 at 09:35:37AM +0000, Damien Le Moal wrote:
> This is only to prevent someone from running zoned btrfs on top of
> dm-crypt. Without this patch, mount will be OK and file data writes will
> also actually be OK. But all reads will miserably fail... I would rather
> have this patch in than deal with the "bug reports" about btrfs failing
> to read files. No?
>
> Note that like you, I dislike having to add such code. But it was my
> oversight when I worked on getting dm-crypt to work on zoned drives. Zone
> append was overlooked at that time... My bad, really.

dm-crypt needs to stop pretending it supports zoned devices if it
doesn't. Note that dm-crypt could fairly trivially support zone append
by doing the same kind of emulation that the sd driver does.
On 19/04/2021 11:29, Christoph Hellwig wrote:
> On Fri, Apr 16, 2021 at 06:17:21PM +0200, David Sterba wrote:
>> On Fri, Apr 16, 2021 at 12:05:27PM +0900, Damien Le Moal wrote:
>>> From: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>>
>>> For zoned btrfs, zone append is mandatory to write to a sequential write
>>> only zone, otherwise parallel writes to the same zone could result in
>>> unaligned write errors.
>>>
>>> If a zoned block device does not support zone append (e.g. a dm-crypt
>>> zoned device using a non-NULL IV cypher), fail to mount.
>>>
>>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>>> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
>>
>> Added to misc-next, thanks. I'll queue it for 5.13, it's not an urgent
>> fix for the 5.12 release, but I'll tag it as stable so it'll appear in
>> 5.12.x later.
>
> Please don't. Zone append is a strict requirement for zoned devices,
> no need to add cargo cult code like this everywhere.
>

As of now, dm-crypt supports zoned devices but cannot really work with
zone append, at least not with ciphers that use the sector number as IV.
OTOH btrfs cannot work without zone append. While we won't notice any
problems with writes, reads (or rather, decryption) will fail.
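The failure mode described above can be shown with a toy model. The following is a minimal sketch, not dm-crypt code: `toy_crypt`, `toy_zone`, `toy_zone_append` and `append_roundtrip_ok` are hypothetical names, the XOR keystream only stands in for a cipher with a sector-based IV, and the sector and zone sizes are made up. The point it illustrates is that with zone append the device chooses the sector at completion time, while the IV had to be fixed earlier, at encryption time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  16	/* toy sector size, chosen for the sketch */
#define ZONE_SECTORS 8	/* toy zone length in sectors */

/*
 * Toy "cipher": XOR with a keystream derived from the sector number.
 * It only models a sector-based IV; it is NOT real cryptography.
 * Applying it twice with the same iv_sector restores the input.
 */
static void toy_crypt(uint8_t *buf, size_t len, uint64_t iv_sector)
{
	for (size_t i = 0; i < len; i++)
		buf[i] ^= (uint8_t)(iv_sector * 31 + i + 1);
}

/* One toy zone: backing store plus a device-owned write pointer. */
struct toy_zone {
	uint8_t data[ZONE_SECTORS][SECTOR_SIZE];
	uint64_t wp;	/* next sector the device will write */
};

/*
 * Zone append: the device picks the sector (its write pointer) and the
 * caller only learns it from the return value, i.e. on completion.
 */
static uint64_t toy_zone_append(struct toy_zone *z, const uint8_t *buf)
{
	assert(z->wp < ZONE_SECTORS);
	uint64_t written = z->wp++;

	memcpy(z->data[written], buf, SECTOR_SIZE);
	return written;
}

/*
 * Encrypt with the IV known at submission (iv_sector), append, then read
 * back and decrypt with the sector the data actually landed on.
 * Returns 1 if the plaintext survives the round trip, 0 otherwise.
 */
static int append_roundtrip_ok(struct toy_zone *z, const uint8_t *plain,
			       uint64_t iv_sector)
{
	uint8_t buf[SECTOR_SIZE];

	memcpy(buf, plain, SECTOR_SIZE);
	toy_crypt(buf, SECTOR_SIZE, iv_sector);		/* write path */
	uint64_t landed = toy_zone_append(z, buf);

	memcpy(buf, z->data[landed], SECTOR_SIZE);
	toy_crypt(buf, SECTOR_SIZE, landed);		/* read path */
	return memcmp(buf, plain, SECTOR_SIZE) == 0;
}
```

In this model, two appends that are both encrypted with the zone start sector (0) as IV behave differently: the first lands on sector 0 and decrypts correctly, the second lands on sector 1 and decrypts to garbage. The write itself succeeds in both cases, which matches the "writes OK, reads fail" symptom discussed in the thread.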
On 2021/04/19 18:41, hch@infradead.org wrote:
> On Mon, Apr 19, 2021 at 09:35:37AM +0000, Damien Le Moal wrote:
>> This is only to prevent someone from running zoned btrfs on top of
>> dm-crypt. Without this patch, mount will be OK and file data writes will
>> also actually be OK. But all reads will miserably fail... I would rather
>> have this patch in than deal with the "bug reports" about btrfs failing
>> to read files. No?
>>
>> Note that like you, I dislike having to add such code. But it was my
>> oversight when I worked on getting dm-crypt to work on zoned drives. Zone
>> append was overlooked at that time... My bad, really.
>
> dm-crypt needs to stop pretending it supports zoned devices if it
> doesn't. Note that dm-crypt could fairly trivially support zone append
> by doing the same kind of emulation that the sd driver does.

I am not so sure about the "trivial" but yes, it is feasible. Let me
think about something then. Whatever we do, performance with ZNS will not
be great, for sure... But for SMR HDDs, we likely will not notice any
difference in performance.
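For readers unfamiliar with the emulation being discussed, the general idea can be sketched as follows. This is a simplified model under stated assumptions, not code from the sd driver or dm-crypt: `wp_tracker`, `wp_ofst` and `emulate_zone_append` are hypothetical names and the zone geometry is made up. The software layer tracks each zone's write pointer itself and turns a zone append into a regular write at that position, so the target sector is known before the write is issued.

```c
#include <assert.h>
#include <stdint.h>

#define NR_ZONES     4	/* toy geometry, chosen for the sketch */
#define ZONE_SECTORS 8

/* Per-zone write pointer offsets, tracked in software (hypothetical). */
struct wp_tracker {
	uint64_t wp_ofst[NR_ZONES];
};

/*
 * Turn a zone append aimed at `zone` into a regular write: choose the
 * target sector from the tracked write pointer, then advance it.
 * Returns the absolute start sector, or UINT64_MAX if the zone does not
 * have room for nr_sectors.
 *
 * Assumption: a real implementation would serialize this per zone (for
 * example with a per-zone lock) so that only one write per zone is in
 * flight and the tracked pointer stays in sync with the device.
 */
static uint64_t emulate_zone_append(struct wp_tracker *t, unsigned int zone,
				    uint64_t nr_sectors)
{
	assert(zone < NR_ZONES);

	/* wp_ofst never exceeds ZONE_SECTORS, so this cannot underflow. */
	if (nr_sectors > ZONE_SECTORS - t->wp_ofst[zone])
		return UINT64_MAX;

	uint64_t sector = (uint64_t)zone * ZONE_SECTORS + t->wp_ofst[zone];

	t->wp_ofst[zone] += nr_sectors;
	return sector;
}
```

Because the sector is known up front, a sector-based IV could be computed for it before encrypting. The price in this model is that appends to one zone are handled one at a time rather than in parallel.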
On Mon, Apr 19, 2021 at 09:46:36AM +0000, Damien Le Moal wrote:
> On 2021/04/19 18:41, hch@infradead.org wrote:
> > On Mon, Apr 19, 2021 at 09:35:37AM +0000, Damien Le Moal wrote:
> >> This is only to prevent someone from running zoned btrfs on top of
> >> dm-crypt. Without this patch, mount will be OK and file data writes will
> >> also actually be OK. But all reads will miserably fail... I would rather
> >> have this patch in than deal with the "bug reports" about btrfs failing
> >> to read files. No?
> >>
> >> Note that like you, I dislike having to add such code. But it was my
> >> oversight when I worked on getting dm-crypt to work on zoned drives. Zone
> >> append was overlooked at that time... My bad, really.
> >
> > dm-crypt needs to stop pretending it supports zoned devices if it
> > doesn't. Note that dm-crypt could fairly trivially support zone append
> > by doing the same kind of emulation that the sd driver does.
>
> I am not so sure about the "trivial" but yes, it is feasible. Let me
> think about something then. Whatever we do, performance with ZNS will not
> be great, for sure... But for SMR HDDs, we likely will not notice any
> difference in performance.

So this needs to be fixed outside of btrfs. The fix in btrfs would make
sense in case we can't sync dm-crypt and btrfs in a released kernel.
Having a mount check sounds like a better option to me than failing
reads; we can revert it in a release once everything works as expected.
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index eeb3ebe11d7a..70b23a0d03b1 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -342,6 +342,13 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device)
 	if (!IS_ALIGNED(nr_sectors, zone_sectors))
 		zone_info->nr_zones++;
 
+	if (bdev_is_zoned(bdev) && zone_info->max_zone_append_size == 0) {
+		btrfs_err(fs_info, "zoned: device %pg does not support zone append",
+			  bdev);
+		ret = -EINVAL;
+		goto out;
+	}
+
 	zone_info->seq_zones = bitmap_zalloc(zone_info->nr_zones, GFP_KERNEL);
 	if (!zone_info->seq_zones) {
 		ret = -ENOMEM;
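The same condition can be approximated from userspace before attempting a mount. The sketch below is an illustration only and is not part of the patch. It assumes the block layer's `queue/zoned` and `queue/zone_append_max_bytes` sysfs attributes reflect the limit that the kernel check above tests, which may not hold on every kernel version. The helper names `zoned_without_append` and `read_queue_attr` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Mirror of the patch's condition: the device is zoned but reports no
 * zone append capability. `zoned_model` is the content of queue/zoned
 * ("none" for a regular device), with the trailing newline removed.
 */
int zoned_without_append(const char *zoned_model,
			 unsigned long long zone_append_max_bytes)
{
	int is_zoned = strcmp(zoned_model, "none") != 0;

	return is_zoned && zone_append_max_bytes == 0;
}

/*
 * Read the first line of /sys/block/<disk>/queue/<attr> into buf and
 * strip the newline. Returns 0 on success, -1 on any error.
 */
int read_queue_attr(const char *disk, const char *attr,
		    char *buf, size_t len)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path), "/sys/block/%s/queue/%s", disk, attr);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would read both attributes for the device in question, convert `zone_append_max_bytes` with `strtoull`, and treat a true result from `zoned_without_append` as "zoned btrfs would refuse to mount here".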