Message ID | 20200420080409.111693-5-maco@android.com (mailing list archive) |
---|---|
State | New, archived |
Series | Add a new LOOP_SET_FD_AND_STATUS ioctl. |
On Mon, Apr 20, 2020 at 10:04:09AM +0200, Martijn Coenen wrote:
> This allows userspace to completely setup a loop device with a single
> ioctl, removing the in-between state where the device can be partially
> configured - eg the loop device has a backing file associated with it,
> but is reading from the wrong offset.
>
> Besides removing the intermediate state, another big benefit of this
> ioctl is that LOOP_SET_STATUS can be slow; the main reason for this
> slowness is that LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to
> freeze the associated queue; this requires waiting for RCU
> synchronization, which I've measured can take about 15-20ms on this
> device on average.
>
> Here's setting up ~70 regular loop devices with an offset on an x86
> Android device, using LOOP_SET_FD and LOOP_SET_STATUS:
>
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> 0m03.40s real 0m00.02s user 0m00.03s system
>
> Here's configuring ~70 devices in the same way, but using a modified
> losetup that uses the new LOOP_SET_FD_AND_STATUS ioctl:
>
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> 0m01.94s real 0m00.01s user 0m00.01s system
>
> Signed-off-by: Martijn Coenen <maco@android.com>
> ---
>  drivers/block/loop.c      | 47 ++++++++++++++++++++++++++++++---------
>  include/uapi/linux/loop.h |  6 +++++
>  2 files changed, 42 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 6e656390b285..e1dbd70d6d6e 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1065,8 +1065,9 @@ loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
>  	return 0;
>  }
>
> -static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> -		       struct block_device *bdev, unsigned int arg)
> +static int loop_set_fd_and_status(struct loop_device *lo, fmode_t mode,
> +				  struct block_device *bdev, unsigned int fd,
> +				  const struct loop_info64 *info)
>  {
>  	struct file *file;
>  	struct inode *inode;
> @@ -1081,7 +1082,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>  	__module_get(THIS_MODULE);
>
>  	error = -EBADF;
> -	file = fget(arg);
> +	file = fget(fd);
>  	if (!file)
>  		goto out;
>
> @@ -1090,7 +1091,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>  	 * here to avoid changing device under exclusive owner.
>  	 */
>  	if (!(mode & FMODE_EXCL)) {
> -		claimed_bdev = bd_start_claiming(bdev, loop_set_fd);
> +		claimed_bdev = bd_start_claiming(bdev, loop_set_fd_and_status);
>  		if (IS_ERR(claimed_bdev)) {
>  			error = PTR_ERR(claimed_bdev);
>  			goto out_putf;
> @@ -1117,9 +1118,24 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>  		lo_flags |= LO_FLAGS_READ_ONLY;
>
>  	error = -EFBIG;
> -	size = get_loop_size(lo, file);
> +	if (info)
> +		size = get_size(info->lo_offset, info->lo_sizelimit,
> +				file);
> +	else
> +		size = get_loop_size(lo, file);
>  	if ((loff_t)(sector_t)size != size)
>  		goto out_unlock;
> +
> +	if (info) {
> +		error = loop_set_from_status(lo, info);
> +		if (error)
> +			goto out_unlock;
> +	} else {
> +		lo->transfer = NULL;
> +		lo->ioctl = NULL;
> +		lo->lo_sizelimit = 0;
> +		lo->lo_offset = 0;
> +	}

Just curious: Can't we just pass in an on-stack info for the legacy
case and avoid all these conditionals?

>  out:
> @@ -1653,7 +1666,18 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
>
>  	switch (cmd) {
>  	case LOOP_SET_FD:
> -		return loop_set_fd(lo, mode, bdev, arg);
> +		return loop_set_fd_and_status(lo, mode, bdev, arg, NULL);
> +	case LOOP_SET_FD_AND_STATUS: {
> +		struct loop_fd_and_status fds;
> +
> +		if (copy_from_user(&fds,
> +				   (struct loop_fd_and_status __user *) arg,
> +				   sizeof(fds)))
> +			return -EFAULT;
> +
> +		return loop_set_fd_and_status(lo, mode, bdev, fds.fd,
> +					      &fds.info);
> +	}

Can you throw in another prep patch that adds a:

	void __user *argp = (void __user *)arg;

line at the top of lo_compat_ioctl, and switches the LOOP_SET_STATUS
and LOOP_GET_STATUS case to it?  Then here you can just do:

	if (copy_from_user(&p, argp, sizeof(p)))
		return -EFAULT;

	return loop_set_fd_and_status(lo, mode, bdev, p.fd, &p.info);

and be done.

> +struct loop_fd_and_status {
> +	struct loop_info64 info;
> +	__u32 fd;

This should grow a

	__u32 __pad;

to avoid different struct sizes on x86-32 vs x86-64.
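
To illustrate the padding remark above: on i386, 64-bit integers inside a struct are only 4-byte aligned, while on x86-64 they are 8-byte aligned, so a struct ending in a lone __u32 gets different tail padding on the two ABIs. The following is a minimal userspace sketch of that effect, not kernel code. All demo_* names are hypothetical, the field list mirrors struct loop_info64 from include/uapi/linux/loop.h, and the packed/aligned(4) attributes (GCC/Clang syntax) are only an approximation of the i386 layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct loop_info64; 232 bytes, no internal holes. */
struct demo_loop_info64 {
	uint64_t lo_device;
	uint64_t lo_inode;
	uint64_t lo_rdevice;
	uint64_t lo_offset;
	uint64_t lo_sizelimit;
	uint32_t lo_number;
	uint32_t lo_encrypt_type;
	uint32_t lo_encrypt_key_size;
	uint32_t lo_flags;
	uint8_t  lo_file_name[64];
	uint8_t  lo_crypt_name[64];
	uint8_t  lo_encrypt_key[32];
	uint64_t lo_init[2];
};

/* Layout as posted, forced to 4-byte alignment to mimic i386. */
struct demo_unpadded_i386 {
	struct demo_loop_info64 info;
	uint32_t fd;
} __attribute__((packed, aligned(4)));

/* Layout with the suggested explicit pad, again mimicking i386. */
struct demo_padded_i386 {
	struct demo_loop_info64 info;
	uint32_t fd;
	uint32_t pad;
} __attribute__((packed, aligned(4)));

/* Same padded layout with the compiler's natural alignment. */
struct demo_padded_native {
	struct demo_loop_info64 info;
	uint32_t fd;
	uint32_t pad;
};

_Static_assert(sizeof(struct demo_loop_info64) == 232, "info is 232 bytes");
/* Without the pad, the 4-byte-aligned layout stops at 236 bytes ... */
_Static_assert(sizeof(struct demo_unpadded_i386) == 236, "no tail padding");
/* ... whereas with the pad both layouts agree on 240 bytes. */
_Static_assert(sizeof(struct demo_padded_i386) == 240, "explicit pad");
_Static_assert(sizeof(struct demo_padded_native) == 240, "explicit pad");
```

With an 8-byte-aligned ABI the unpadded struct would be rounded up to 240 bytes by the compiler, so a 32-bit caller and a 64-bit kernel would disagree on sizeof() unless the pad field is spelled out.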
On Wed, Apr 22, 2020 at 8:19 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Apr 20, 2020 at 10:04:09AM +0200, Martijn Coenen wrote:
> > This allows userspace to completely setup a loop device with a single
> > ioctl, removing the in-between state where the device can be partially
> > configured - eg the loop device has a backing file associated with it,
> > but is reading from the wrong offset.
> >
> > Besides removing the intermediate state, another big benefit of this
> > ioctl is that LOOP_SET_STATUS can be slow; the main reason for this
> > slowness is that LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to
> > freeze the associated queue; this requires waiting for RCU
> > synchronization, which I've measured can take about 15-20ms on this
> > device on average.
> >
> > Here's setting up ~70 regular loop devices with an offset on an x86
> > Android device, using LOOP_SET_FD and LOOP_SET_STATUS:
> >
> > vsoc_x86:/system/apex # time for i in `seq 30 100`;
> > do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> > 0m03.40s real 0m00.02s user 0m00.03s system
> >
> > Here's configuring ~70 devices in the same way, but using a modified
> > losetup that uses the new LOOP_SET_FD_AND_STATUS ioctl:
> >
> > vsoc_x86:/system/apex # time for i in `seq 30 100`;
> > do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> > 0m01.94s real 0m00.01s user 0m00.01s system
> >
> > Signed-off-by: Martijn Coenen <maco@android.com>
> > ---
> >  drivers/block/loop.c      | 47 ++++++++++++++++++++++++++++++---------
> >  include/uapi/linux/loop.h |  6 +++++
> >  2 files changed, 42 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index 6e656390b285..e1dbd70d6d6e 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -1065,8 +1065,9 @@ loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
> >  	return 0;
> >  }
> >
> > -static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> > -		       struct block_device *bdev, unsigned int arg)
> > +static int loop_set_fd_and_status(struct loop_device *lo, fmode_t mode,
> > +				  struct block_device *bdev, unsigned int fd,
> > +				  const struct loop_info64 *info)
> >  {
> >  	struct file *file;
> >  	struct inode *inode;
> > @@ -1081,7 +1082,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> >  	__module_get(THIS_MODULE);
> >
> >  	error = -EBADF;
> > -	file = fget(arg);
> > +	file = fget(fd);
> >  	if (!file)
> >  		goto out;
> >
> > @@ -1090,7 +1091,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> >  	 * here to avoid changing device under exclusive owner.
> >  	 */
> >  	if (!(mode & FMODE_EXCL)) {
> > -		claimed_bdev = bd_start_claiming(bdev, loop_set_fd);
> > +		claimed_bdev = bd_start_claiming(bdev, loop_set_fd_and_status);
> >  		if (IS_ERR(claimed_bdev)) {
> >  			error = PTR_ERR(claimed_bdev);
> >  			goto out_putf;
> > @@ -1117,9 +1118,24 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> >  		lo_flags |= LO_FLAGS_READ_ONLY;
> >
> >  	error = -EFBIG;
> > -	size = get_loop_size(lo, file);
> > +	if (info)
> > +		size = get_size(info->lo_offset, info->lo_sizelimit,
> > +				file);
> > +	else
> > +		size = get_loop_size(lo, file);
> >  	if ((loff_t)(sector_t)size != size)
> >  		goto out_unlock;
> > +
> > +	if (info) {
> > +		error = loop_set_from_status(lo, info);
> > +		if (error)
> > +			goto out_unlock;
> > +	} else {
> > +		lo->transfer = NULL;
> > +		lo->ioctl = NULL;
> > +		lo->lo_sizelimit = 0;
> > +		lo->lo_offset = 0;
> > +	}
>
> Just curious: Can't we just pass in an on-stack info for the legacy
> case and avoid all these conditionals?

Yeah, that is actually much nicer. I will rework it to that.

> Can you throw in another prep patch that adds a:
>
> void __user *argp = (void __user *)arg;
>
> line at the top of lo_compat_ioctl, and switches the LOOP_SET_STATUS
> and LOOP_GET_STATUS case to it?

Did you mean in regular lo_ioctl()?
eg something like this:

@@ -1671,6 +1671,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 	unsigned int cmd, unsigned long arg)
 {
 	struct loop_device *lo = bdev->bd_disk->private_data;
+	void __user *argp = (void __user *) arg;
 	int err;

 	switch (cmd) {
@@ -1694,21 +1695,19 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 	case LOOP_SET_STATUS:
 		err = -EPERM;
 		if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
-			err = loop_set_status_old(lo,
-					(struct loop_info __user *)arg);
+			err = loop_set_status_old(lo, argp);
 		}
 		break;
 	case LOOP_GET_STATUS:
-		return loop_get_status_old(lo, (struct loop_info __user *) arg);
+		return loop_get_status_old(lo, argp);
 	case LOOP_SET_STATUS64:
 		err = -EPERM;
 		if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
-			err = loop_set_status64(lo,
-					(struct loop_info64 __user *) arg);
+			err = loop_set_status64(lo, argp);
 		}
 		break;
 	case LOOP_GET_STATUS64:
-		return loop_get_status64(lo, (struct loop_info64 __user *) arg);
+		return loop_get_status64(lo, argp);

> > +struct loop_fd_and_status {
> > +	struct loop_info64 info;
> > +	__u32 fd;
>
> This should grow a
>
> __u32 __pad;
>
> to avoid different struct sizes on x86-32 vs x86-64.

will do, thanks!
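
The "on-stack info for the legacy case" idea agreed on above can be sketched outside the kernel as follows. This is only an illustration of the pattern under stated assumptions: the demo_* types and functions are hypothetical stand-ins, not the loop driver's structures, and the real rework may well look different.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the status/config passed by the caller. */
struct demo_info {
	uint64_t offset;
	uint64_t sizelimit;
};

/* Hypothetical stand-in for the device being configured. */
struct demo_dev {
	int      backing_fd;
	uint64_t offset;
	uint64_t sizelimit;
};

/*
 * Common path: it always receives a valid info pointer, so it needs
 * no "if (info) ... else ..." branches.
 */
static int demo_configure(struct demo_dev *dev, int fd,
			  const struct demo_info *info)
{
	if (fd < 0)
		return -1;

	dev->backing_fd = fd;
	dev->offset = info->offset;
	dev->sizelimit = info->sizelimit;
	return 0;
}

/*
 * Legacy entry point: a zeroed on-stack info stands for "all defaults",
 * which replaces the explicit resets in the else branch.
 */
static int demo_set_fd(struct demo_dev *dev, int fd)
{
	struct demo_info info;

	memset(&info, 0, sizeof(info));
	return demo_configure(dev, fd, &info);
}
```

The design point is that the defaults live in one place (the zeroed struct) and the shared function has a single code path for both ioctls.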
On Wed, Apr 22, 2020 at 10:06:17AM +0200, Martijn Coenen wrote:
> > line at the top of lo_compat_ioctl, and switches the LOOP_SET_STATUS
> > and LOOP_GET_STATUS case to it?
>
> Did you mean in regular lo_ioctl()?

Yes, sorry.

> eg something like this:
>
> @@ -1671,6 +1671,7 @@ static int lo_ioctl(struct block_device *bdev,
> fmode_t mode,
>  	unsigned int cmd, unsigned long arg)
>  {
>  	struct loop_device *lo = bdev->bd_disk->private_data;
> +	void __user *argp = (void __user *) arg;
>  	int err;
>
>  	switch (cmd) {
> @@ -1694,21 +1695,19 @@ static int lo_ioctl(struct block_device *bdev,
> fmode_t mode,
>  	case LOOP_SET_STATUS:
>  		err = -EPERM;
>  		if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
> -			err = loop_set_status_old(lo,
> -					(struct loop_info __user *)arg);
> +			err = loop_set_status_old(lo, argp);
>  		}
>  		break;
>  	case LOOP_GET_STATUS:
> -		return loop_get_status_old(lo, (struct loop_info __user *) arg);
> +		return loop_get_status_old(lo, argp);
>  	case LOOP_SET_STATUS64:
>  		err = -EPERM;
>  		if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
> -			err = loop_set_status64(lo,
> -					(struct loop_info64 __user *) arg);
> +			err = loop_set_status64(lo, argp);
>  		}
>  		break;
>  	case LOOP_GET_STATUS64:
> -		return loop_get_status64(lo, (struct loop_info64 __user *) arg);
> +		return loop_get_status64(lo, argp);

Exactly!
On 4/20/20 1:04 AM, Martijn Coenen wrote:
> diff --git a/include/uapi/linux/loop.h b/include/uapi/linux/loop.h
> index 080a8df134ef..fcc9a693b588 100644
> --- a/include/uapi/linux/loop.h
> +++ b/include/uapi/linux/loop.h
> @@ -60,6 +60,11 @@ struct loop_info64 {
>  	__u64		   lo_init[2];
>  };
>
> +struct loop_fd_and_status {
> +	struct loop_info64	info;
> +	__u32			fd;
> +};
> +
>  /*
>   * Loop filter types
>   */
> @@ -90,6 +95,7 @@ struct loop_info64 {
>  #define LOOP_SET_CAPACITY	0x4C07
>  #define LOOP_SET_DIRECT_IO	0x4C08
>  #define LOOP_SET_BLOCK_SIZE	0x4C09
> +#define LOOP_SET_FD_AND_STATUS	0x4C0A
>
>  /* /dev/loop-control interface */
>  #define LOOP_CTL_ADD		0x4C80

Should linux-api be Cc'ed for this patch or the entire patch series?
From https://www.kernel.org/doc/man-pages/linux-api-ml.html: "Among
other things, a primary goal of the list is to help answer the question:
How do we even know when an interface has been added or changed?".

Thanks,

Bart.
On Wed, Apr 22, 2020 at 5:10 PM Bart Van Assche <bvanassche@acm.org> wrote:
> Should linux-api be Cc'ed for this patch or the entire patch series?
> From https://www.kernel.org/doc/man-pages/linux-api-ml.html: "Among
> other things, a primary goal of the list is to help answer the question:
> How do we even know when an interface has been added or changed?".

Thanks for the suggestion, will add linux-api in the next series.

Martijn

>
> Thanks,
>
> Bart.
>
>
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 6e656390b285..e1dbd70d6d6e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1065,8 +1065,9 @@ loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
 	return 0;
 }
 
-static int loop_set_fd(struct loop_device *lo, fmode_t mode,
-		       struct block_device *bdev, unsigned int arg)
+static int loop_set_fd_and_status(struct loop_device *lo, fmode_t mode,
+				  struct block_device *bdev, unsigned int fd,
+				  const struct loop_info64 *info)
 {
 	struct file	*file;
 	struct inode	*inode;
@@ -1081,7 +1082,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	__module_get(THIS_MODULE);
 
 	error = -EBADF;
-	file = fget(arg);
+	file = fget(fd);
 	if (!file)
 		goto out;
 
@@ -1090,7 +1091,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	 * here to avoid changing device under exclusive owner.
 	 */
 	if (!(mode & FMODE_EXCL)) {
-		claimed_bdev = bd_start_claiming(bdev, loop_set_fd);
+		claimed_bdev = bd_start_claiming(bdev, loop_set_fd_and_status);
 		if (IS_ERR(claimed_bdev)) {
 			error = PTR_ERR(claimed_bdev);
 			goto out_putf;
@@ -1117,9 +1118,24 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 		lo_flags |= LO_FLAGS_READ_ONLY;
 
 	error = -EFBIG;
-	size = get_loop_size(lo, file);
+	if (info)
+		size = get_size(info->lo_offset, info->lo_sizelimit,
+				file);
+	else
+		size = get_loop_size(lo, file);
 	if ((loff_t)(sector_t)size != size)
 		goto out_unlock;
+
+	if (info) {
+		error = loop_set_from_status(lo, info);
+		if (error)
+			goto out_unlock;
+	} else {
+		lo->transfer = NULL;
+		lo->ioctl = NULL;
+		lo->lo_sizelimit = 0;
+		lo->lo_offset = 0;
+	}
 	error = loop_prepare_queue(lo);
 	if (error)
 		goto out_unlock;
@@ -1132,9 +1148,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	lo->lo_device = bdev;
 	lo->lo_flags = lo_flags;
 	lo->lo_backing_file = file;
-	lo->transfer = NULL;
-	lo->ioctl = NULL;
-	lo->lo_sizelimit = 0;
 	lo->old_gfp_mask = mapping_gfp_mask(mapping);
 	mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
 
@@ -1172,14 +1185,14 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
 	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd);
+		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd_and_status);
 	return 0;
 
 out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
 out_bdev:
 	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd);
+		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd_and_status);
 out_putf:
 	fput(file);
 out:
@@ -1653,7 +1666,18 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 
 	switch (cmd) {
 	case LOOP_SET_FD:
-		return loop_set_fd(lo, mode, bdev, arg);
+		return loop_set_fd_and_status(lo, mode, bdev, arg, NULL);
+	case LOOP_SET_FD_AND_STATUS: {
+		struct loop_fd_and_status fds;
+
+		if (copy_from_user(&fds,
+				   (struct loop_fd_and_status __user *) arg,
+				   sizeof(fds)))
+			return -EFAULT;
+
+		return loop_set_fd_and_status(lo, mode, bdev, fds.fd,
+					      &fds.info);
+	}
 	case LOOP_CHANGE_FD:
 		return loop_change_fd(lo, bdev, arg);
 	case LOOP_CLR_FD:
@@ -1827,6 +1851,7 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
 	case LOOP_CLR_FD:
 	case LOOP_GET_STATUS64:
 	case LOOP_SET_STATUS64:
+	case LOOP_SET_FD_AND_STATUS:
 		arg = (unsigned long) compat_ptr(arg);
 		/* fall through */
 	case LOOP_SET_FD:
diff --git a/include/uapi/linux/loop.h b/include/uapi/linux/loop.h
index 080a8df134ef..fcc9a693b588 100644
--- a/include/uapi/linux/loop.h
+++ b/include/uapi/linux/loop.h
@@ -60,6 +60,11 @@ struct loop_info64 {
 	__u64		   lo_init[2];
 };
 
+struct loop_fd_and_status {
+	struct loop_info64	info;
+	__u32			fd;
+};
+
 /*
  * Loop filter types
  */
@@ -90,6 +95,7 @@ struct loop_info64 {
 #define LOOP_SET_CAPACITY	0x4C07
 #define LOOP_SET_DIRECT_IO	0x4C08
 #define LOOP_SET_BLOCK_SIZE	0x4C09
+#define LOOP_SET_FD_AND_STATUS	0x4C0A
 
 /* /dev/loop-control interface */
 #define LOOP_CTL_ADD		0x4C80
This allows userspace to completely setup a loop device with a single
ioctl, removing the in-between state where the device can be partially
configured - eg the loop device has a backing file associated with it,
but is reading from the wrong offset.

Besides removing the intermediate state, another big benefit of this
ioctl is that LOOP_SET_STATUS can be slow; the main reason for this
slowness is that LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to
freeze the associated queue; this requires waiting for RCU
synchronization, which I've measured can take about 15-20ms on this
device on average.

Here's setting up ~70 regular loop devices with an offset on an x86
Android device, using LOOP_SET_FD and LOOP_SET_STATUS:

vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
0m03.40s real 0m00.02s user 0m00.03s system

Here's configuring ~70 devices in the same way, but using a modified
losetup that uses the new LOOP_SET_FD_AND_STATUS ioctl:

vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
0m01.94s real 0m00.01s user 0m00.01s system

Signed-off-by: Martijn Coenen <maco@android.com>
---
 drivers/block/loop.c      | 47 ++++++++++++++++++++++++++++++---------
 include/uapi/linux/loop.h |  6 +++++
 2 files changed, 42 insertions(+), 11 deletions(-)
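
As a rough cross-check of the timings in the commit message, the per-device saving can be worked out from the two wall-clock figures. The sketch below only restates that arithmetic; the variable and function names are hypothetical, and the device count is taken from `seq 30 100`, which yields 71 values.

```c
/* Wall-clock times copied from the commit message. */
static const double two_ioctl_secs = 3.40;	/* LOOP_SET_FD + LOOP_SET_STATUS */
static const double one_ioctl_secs = 1.94;	/* LOOP_SET_FD_AND_STATUS */

/* `seq 30 100` enumerates 30..100 inclusive. */
static const int device_count = 100 - 30 + 1;

/* Average wall-clock time saved per configured loop device, in ms. */
static double saved_ms_per_device(void)
{
	return (two_ioctl_secs - one_ioctl_secs) * 1000.0 / device_count;
}
```

This comes to roughly 20.6 ms per device, which is in the same range as the 15-20 ms quoted for one queue freeze, so the numbers are at least consistent with the freeze being the dominant cost that the combined ioctl avoids.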