diff mbox series

[RFC,6/9] dm-linear: add llseek(SEEK_HOLE/SEEK_DATA) support

Message ID 20240328203910.2370087-7-stefanha@redhat.com (mailing list archive)
State Deferred, archived
Delegated to: Mike Snitzer
Headers show
Series block: add llseek(SEEK_HOLE/SEEK_DATA) support | expand

Commit Message

Stefan Hajnoczi March 28, 2024, 8:39 p.m. UTC
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 drivers/md/dm-linear.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

Eric Blake March 29, 2024, 12:54 a.m. UTC | #1
On Thu, Mar 28, 2024 at 04:39:07PM -0400, Stefan Hajnoczi wrote:
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  drivers/md/dm-linear.c | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
> index 2d3e186ca87e3..9b6cdfa4f951d 100644
> --- a/drivers/md/dm-linear.c
> +++ b/drivers/md/dm-linear.c
> @@ -147,6 +147,30 @@ static int linear_report_zones(struct dm_target *ti,
>  #define linear_report_zones NULL
>  #endif
>  
> +static loff_t linear_seek_hole_data(struct dm_target *ti, loff_t offset,
> +		int whence)
> +{
> +	struct linear_c *lc = ti->private;
> +	loff_t ti_begin = ti->begin << SECTOR_SHIFT;
> +	loff_t ti_len = ti->len << SECTOR_SHIFT;
> +	loff_t bdev_start = lc->start << SECTOR_SHIFT;
> +	loff_t bdev_offset;

Okay, given my questions in 4/9, it looks like your intent is that
each callback for dm_seek_hole_data will obey its own ti-> limits.

> +
> +	/* TODO underflow/overflow? */
> +	bdev_offset = offset - ti_begin + bdev_start;
> +
> +	bdev_offset = blkdev_seek_hole_data(lc->dev->bdev, bdev_offset,
> +					    whence);
> +	if (bdev_offset < 0)
> +		return bdev_offset;
> +
> +	offset = bdev_offset - bdev_start;
> +	if (offset >= ti_len)
> +		return whence == SEEK_DATA ? -ENXIO : ti_begin + ti_len;

However, this is inconsistent with dm_blk_seek_hole_data_default in
4/9; I think you want to unconditionally return -ENXIO here, and let
the caller figure out when to turn -ENXIO back into end to proceed
with the next ti in the list.

OR, you may want to document the semantics that dm_seek_hole_data
callbacks must NOT return -ENXIO, but always return ti_begin + ti_len
when the answer (either SEEK_HOLE or SEEK_END) did not lie within the
current ti - it is DIFFERENT than the semantics for
blkdev_seek_hole_data, but gets normalized back into the expected
-ENXIO answer when dm_blk_do_seek_hole_data finally advances past the
last ti.

At any rate, I know this is an RFC series, but it goes to show that
comments will be essential, whichever interface you decide for
callbacks to honor (both a guarantee that callbacks will only ever see
SEEK_HOLE/SEEK_DATA in bounds, because earlier points in the call
stack have filtered out out-of-bounds and SEEK_SET; and constraints on
what the return value(s) must be for the various callbacks, especially
if it is different from the eventual return value of the overall
llseek syscall)

> +
> +	return offset + ti_begin;
> +}
> +
>  static int linear_iterate_devices(struct dm_target *ti,
>  				  iterate_devices_callout_fn fn, void *data)
>  {
> @@ -212,6 +236,7 @@ static struct target_type linear_target = {
>  	.direct_access = linear_dax_direct_access,
>  	.dax_zero_page_range = linear_dax_zero_page_range,
>  	.dax_recovery_write = linear_dax_recovery_write,
> +	.seek_hole_data = linear_seek_hole_data,
>  };
>  
>  int __init dm_linear_init(void)
> -- 
> 2.44.0
>
Stefan Hajnoczi April 3, 2024, 2:22 p.m. UTC | #2
On Thu, Mar 28, 2024 at 07:54:21PM -0500, Eric Blake wrote:
> On Thu, Mar 28, 2024 at 04:39:07PM -0400, Stefan Hajnoczi wrote:
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  drivers/md/dm-linear.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> > 
> > diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
> > index 2d3e186ca87e3..9b6cdfa4f951d 100644
> > --- a/drivers/md/dm-linear.c
> > +++ b/drivers/md/dm-linear.c
> > @@ -147,6 +147,30 @@ static int linear_report_zones(struct dm_target *ti,
> >  #define linear_report_zones NULL
> >  #endif
> >  
> > +static loff_t linear_seek_hole_data(struct dm_target *ti, loff_t offset,
> > +		int whence)
> > +{
> > +	struct linear_c *lc = ti->private;
> > +	loff_t ti_begin = ti->begin << SECTOR_SHIFT;
> > +	loff_t ti_len = ti->len << SECTOR_SHIFT;
> > +	loff_t bdev_start = lc->start << SECTOR_SHIFT;
> > +	loff_t bdev_offset;
> 
> Okay, given my questions in 4/9, it looks like your intent is that
> each callback for dm_seek_hole_data will obey its own ti-> limits.

Yes, exactly.

> 
> > +
> > +	/* TODO underflow/overflow? */
> > +	bdev_offset = offset - ti_begin + bdev_start;
> > +
> > +	bdev_offset = blkdev_seek_hole_data(lc->dev->bdev, bdev_offset,
> > +					    whence);
> > +	if (bdev_offset < 0)
> > +		return bdev_offset;
> > +
> > +	offset = bdev_offset - bdev_start;
> > +	if (offset >= ti_len)
> > +		return whence == SEEK_DATA ? -ENXIO : ti_begin + ti_len;
> 
> However, this is inconsistent with dm_blk_seek_hole_data_default in
> 4/9; I think you want to unconditionally return -ENXIO here, and let
> the caller figure out when to turn -ENXIO back into end to proceed
> with the next ti in the list.
> 
> OR, you may want to document the semantics that dm_seek_hole_data
> callbacks must NOT return -ENXIO, but always return ti_begin + ti_len
> when the answer (either SEEK_HOLE or SEEK_END) did not lie within the
> current ti - it is DIFFERENT than the semantics for
> blkdev_seek_hole_data, but gets normalized back into the expected
> -ENXIO answer when dm_blk_do_seek_hole_data finally advances past the
> last ti.
> 
> At any rate, I know this is an RFC series, but it goes to show that
> comments will be essential, whichever interface you decide for
> callbacks to honor (both a guarantee that callbacks will only ever see
> SEEK_HOLE/SEEK_DATA in bounds, because earlier points in the call
> stack have filtered out out-of-bounds and SEEK_SET; and constraints on
> what the return value(s) must be for the various callbacks, especially
> if it is different from the eventual return value of the overall
> llseek syscall)

It's easier to understand the code when lseek function has the same
semantics, so I'd rather not customize the semantics for certain lseek
functions.

I'll make sure that the device-mapper targets match the
dm_blk_seek_hole_data_default() behavior. To be honest, I relied on dm.c
always passing offset values that are within the target, but that in
itself is customizing the semantics :).
diff mbox series

Patch

diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 2d3e186ca87e3..9b6cdfa4f951d 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -147,6 +147,30 @@  static int linear_report_zones(struct dm_target *ti,
 #define linear_report_zones NULL
 #endif
 
+static loff_t linear_seek_hole_data(struct dm_target *ti, loff_t offset,
+		int whence)
+{
+	struct linear_c *lc = ti->private;
+	loff_t ti_begin = ti->begin << SECTOR_SHIFT;
+	loff_t ti_len = ti->len << SECTOR_SHIFT;
+	loff_t bdev_start = lc->start << SECTOR_SHIFT;
+	loff_t bdev_offset;
+
+	/* TODO underflow/overflow? */
+	bdev_offset = offset - ti_begin + bdev_start;
+
+	bdev_offset = blkdev_seek_hole_data(lc->dev->bdev, bdev_offset,
+					    whence);
+	if (bdev_offset < 0)
+		return bdev_offset;
+
+	offset = bdev_offset - bdev_start;
+	if (offset >= ti_len)
+		return whence == SEEK_DATA ? -ENXIO : ti_begin + ti_len;
+
+	return offset + ti_begin;
+}
+
 static int linear_iterate_devices(struct dm_target *ti,
 				  iterate_devices_callout_fn fn, void *data)
 {
@@ -212,6 +236,7 @@  static struct target_type linear_target = {
 	.direct_access = linear_dax_direct_access,
 	.dax_zero_page_range = linear_dax_zero_page_range,
 	.dax_recovery_write = linear_dax_recovery_write,
+	.seek_hole_data = linear_seek_hole_data,
 };
 
 int __init dm_linear_init(void)