Message ID | 20191014155030.GS13108@magnolia (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] loop: fix no-unmap write-zeroes request behavior | expand |
On 10/14/19 10:50 AM, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > Currently, if the loop device receives a WRITE_ZEROES request, it asks > the underlying filesystem to punch out the range. This behavior is > correct if unmapping is allowed. However, a NOUNMAP request means that > the caller doesn't want us to free the storage backing the range, so > punching out the range is incorrect behavior. > > To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the > underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to > the fallocate documentation) required to ensure that the entire range is > backed by real storage, which suffices for our purposes. > > Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES") > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > v3: refactor into a single fallocate function > v2: reorganize a little according to hch feedback > --- > drivers/block/loop.c | 26 ++++++++++++++++++-------- > 1 file changed, 18 insertions(+), 8 deletions(-) > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > index f6f77eaa7217..ef6e251857c8 100644 > --- a/drivers/block/loop.c > +++ b/drivers/block/loop.c > @@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > return ret; > } > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > +static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos, > + int mode) > { > /* > - * We use punch hole to reclaim the free space used by the > - * image a.k.a. discard. However we do not support discard if > - * encryption is enabled, because it may give an attacker > - * useful information. > + * We use fallocate to manipulate the space mappings used by the image > + * a.k.a. discard/zerorange. However we do not support this if > + * encryption is enabled, because it may give an attacker useful > + * information. > */ > struct file *file = lo->lo_backing_file; > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > int ret; > > + mode |= FALLOC_FL_KEEP_SIZE; > + > if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > ret = -EOPNOTSUPP; > goto out; > @@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > switch (req_op(rq)) { > case REQ_OP_FLUSH: > return lo_req_flush(lo, rq); > - case REQ_OP_DISCARD: > case REQ_OP_WRITE_ZEROES: > - return lo_discard(lo, rq, pos); cxz ÿbvVBV > + case REQ_OP_DISCARD: > + return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE); I get lost in the twisty passages. What happens if the filesystem hosting the backing file doesn't support fallocate, and REQ_OP_DISCARD / REQ_OP_WRITE_ZEROES returns EOPNOTSUPP - discard is advisory, is it ok to fail REQ_OP_WRITE_ZEROES? Does something at another layer fall back to writing zeros? -Eric > case REQ_OP_WRITE: > if (lo->transfer) > return lo_write_transfer(lo, rq, pos); >
On Mon, Oct 14, 2019 at 11:39:43AM -0500, Eric Sandeen wrote: > On 10/14/19 10:50 AM, Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > Currently, if the loop device receives a WRITE_ZEROES request, it asks > > the underlying filesystem to punch out the range. This behavior is > > correct if unmapping is allowed. However, a NOUNMAP request means that > > the caller doesn't want us to free the storage backing the range, so > > punching out the range is incorrect behavior. > > > > To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the > > underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to > > the fallocate documentation) required to ensure that the entire range is > > backed by real storage, which suffices for our purposes. > > > > Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES") > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > --- > > v3: refactor into a single fallocate function > > v2: reorganize a little according to hch feedback > > --- > > drivers/block/loop.c | 26 ++++++++++++++++++-------- > > 1 file changed, 18 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c > > index f6f77eaa7217..ef6e251857c8 100644 > > --- a/drivers/block/loop.c > > +++ b/drivers/block/loop.c > > @@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, > > return ret; > > } > > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) > > +static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos, > > + int mode) > > { > > /* > > - * We use punch hole to reclaim the free space used by the > > - * image a.k.a. discard. However we do not support discard if > > - * encryption is enabled, because it may give an attacker > > - * useful information. > > + * We use fallocate to manipulate the space mappings used by the image > > + * a.k.a. discard/zerorange. However we do not support this if > > + * encryption is enabled, because it may give an attacker useful > > + * information. > > */ > > struct file *file = lo->lo_backing_file; > > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; > > int ret; > > + mode |= FALLOC_FL_KEEP_SIZE; > > + > > if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { > > ret = -EOPNOTSUPP; > > goto out; > > @@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) > > switch (req_op(rq)) { > > case REQ_OP_FLUSH: > > return lo_req_flush(lo, rq); > > - case REQ_OP_DISCARD: > > case REQ_OP_WRITE_ZEROES: > > - return lo_discard(lo, rq, pos); > cxz ÿbvVBV Yes. > > + case REQ_OP_DISCARD: > > + return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE); > > I get lost in the twisty passages. What happens if the filesystem hosting the > backing file doesn't support fallocate, and REQ_OP_DISCARD / REQ_OP_WRITE_ZEROES > returns EOPNOTSUPP - discard is advisory, is it ok to fail REQ_OP_WRITE_ZEROES? > Does something at another layer fall back to writing zeros? If the REQ_OP_WRITE_ZEROES request was initiated by blkdev_issue_zeroout and we send back an error code, blkdev_issue_zeroout will fall back to writing zeroes if BLKDEV_ZERO_NOFALLBACK wasn't set its caller. Note that calling FALLOC_FL_ZERO_RANGE on a block device will generate a REQ_OP_WRITE_ZEROES | REQ_OP_NOUNMAP request, which means that it will try fallocate zeroing and fall back to writing zeroes. --D > > -Eric > > > case REQ_OP_WRITE: > > if (lo->transfer) > > return lo_write_transfer(lo, rq, pos); > >
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index f6f77eaa7217..ef6e251857c8 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq, return ret; } -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos) +static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos, + int mode) { /* - * We use punch hole to reclaim the free space used by the - * image a.k.a. discard. However we do not support discard if - * encryption is enabled, because it may give an attacker - * useful information. + * We use fallocate to manipulate the space mappings used by the image + * a.k.a. discard/zerorange. However we do not support this if + * encryption is enabled, because it may give an attacker useful + * information. */ struct file *file = lo->lo_backing_file; - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE; int ret; + mode |= FALLOC_FL_KEEP_SIZE; + if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) { ret = -EOPNOTSUPP; goto out; @@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) switch (req_op(rq)) { case REQ_OP_FLUSH: return lo_req_flush(lo, rq); - case REQ_OP_DISCARD: case REQ_OP_WRITE_ZEROES: - return lo_discard(lo, rq, pos); + /* + * If the caller doesn't want deallocation, call zeroout to + * write zeroes the range. Otherwise, punch them out. + */ + return lo_fallocate(lo, rq, pos, + (rq->cmd_flags & REQ_NOUNMAP) ? + FALLOC_FL_ZERO_RANGE : + FALLOC_FL_PUNCH_HOLE); + case REQ_OP_DISCARD: + return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE); case REQ_OP_WRITE: if (lo->transfer) return lo_write_transfer(lo, rq, pos);