Message ID | 20170203195037.4238-1-nirsof@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Ping

On Fri, Feb 3, 2017 at 9:50 PM, Nir Soffer <nirsof@gmail.com> wrote:
> When using file system that does not support fallocate() (e.g. NFS <
> 4.2), truncating the file only when preallocation=OFF speeds up creating
> raw file.
> [...]
On 03.02.2017 at 20:50, Nir Soffer wrote:
> When using file system that does not support fallocate() (e.g. NFS <
> 4.2), truncating the file only when preallocation=OFF speeds up creating
> raw file.
> [...]
> Signed-off-by: Nir Soffer <nirsof@gmail.com>

Thanks, applied to the block branch.

I'm not completely sure if doing an ftruncate() first couldn't improve
PREALLOC_MODE_FULL somewhat in some cases, but I agree that the patch
should still result in correct images.

Kevin
On Thu, Feb 16, 2017 at 7:52 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> Thanks, applied to the block branch.
>
> I'm not completely sure if doing an ftruncate() first couldn't improve
> PREALLOC_MODE_FULL somewhat in some cases, but I agree that the patch
> should still result in correct images.

Good point, I'll do some tests with full mode to check this.

Do you know which cases can benefit from ftruncate() before full
preallocation?

Nir
On 16.02.2017 at 19:23, Nir Soffer wrote:
> On Thu, Feb 16, 2017 at 7:52 PM, Kevin Wolf <kwolf@redhat.com> wrote:
> > I'm not completely sure if doing an ftruncate() first couldn't improve
> > PREALLOC_MODE_FULL somewhat in some cases, but I agree that the patch
> > should still result in correct images.
>
> Good point, I'll do some tests with full mode to check this.
>
> Do you know which cases can benefit from ftruncate() before full
> preallocation?

Knowing the final size from the beginning could allow the file system
driver to do fewer allocations and possibly avoid fragmentation of the
file. I'm not sure if that's easily measurable, though.

I'd also expect that with XFS, you'll see less preallocation, or none at
all. 'du' often shows values much larger than the actual image size on
XFS because the file was growing a lot, so without having the actual
size, XFS takes a guess and does some generous preallocation.

Kevin
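The 'du' observation above can be checked programmatically by comparing a file's apparent size with the space actually allocated for it. The following is a minimal sketch, not part of the patch: the names `file_usage`, `get_file_usage` and `create_sparse_file` are hypothetical, and it assumes `st_blocks` is counted in 512-byte units, as it is on Linux.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: sizes in bytes. */
struct file_usage {
    off_t apparent;   /* what 'ls -l' shows (st_size) */
    off_t allocated;  /* roughly what 'du' shows (st_blocks * 512) */
};

/* Fill *u for the file at path. Returns 0 on success, -1 on error. */
static int get_file_usage(const char *path, struct file_usage *u)
{
    struct stat st;

    assert(u != NULL);
    if (stat(path, &st) != 0) {
        return -1;
    }
    u->apparent = st.st_size;
    /* Assumption: st_blocks is in 512-byte units (true on Linux). */
    u->allocated = (off_t)st.st_blocks * 512;
    return 0;
}

/* Create (or truncate) path and set its size without writing any data. */
static int create_sparse_file(const char *path, off_t size)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);

    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, size) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

On a file grown by many appending writes on XFS, `allocated` may exceed `apparent`; on a file that was only truncated to its size, `allocated` is typically much smaller than `apparent`.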
diff --git a/block/file-posix.c b/block/file-posix.c
index 2134e0e..442f080 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1591,12 +1591,6 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
 #endif
     }
 
-    if (ftruncate(fd, total_size) != 0) {
-        result = -errno;
-        error_setg_errno(errp, -result, "Could not resize file");
-        goto out_close;
-    }
-
     switch (prealloc) {
 #ifdef CONFIG_POSIX_FALLOCATE
     case PREALLOC_MODE_FALLOC:
@@ -1636,6 +1630,10 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
         break;
     }
     case PREALLOC_MODE_OFF:
+        if (ftruncate(fd, total_size) != 0) {
+            result = -errno;
+            error_setg_errno(errp, -result, "Could not resize file");
+        }
         break;
     default:
         result = -EINVAL;
@@ -1644,7 +1642,6 @@ static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
         break;
     }
 
-out_close:
     if (qemu_close(fd) != 0 && result == 0) {
         result = -errno;
         error_setg_errno(errp, -result, "Could not close the new file");
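To illustrate why the image still ends up with the right size after this change, here is a standalone model of the post-patch control flow. It is a sketch only and not QEMU code: `create_raw_model`, `file_size` and the `MODE_*` names are hypothetical, error reporting is reduced to a negative errno return, and the full preallocation mode is left out. It relies on the fact that posix_fallocate() itself extends the file to offset + len, so no prior ftruncate() is needed in that branch.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for the preallocation modes discussed above. */
enum prealloc_mode { MODE_OFF, MODE_FALLOC };

/* Create path with the given size. Returns 0 or a negative errno value. */
static int create_raw_model(const char *path, off_t size,
                            enum prealloc_mode mode)
{
    int result = 0;
    int fd;

    assert(size > 0);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return -errno;
    }

    switch (mode) {
    case MODE_FALLOC:
        /* Returns the error number directly and does not set errno. */
        result = -posix_fallocate(fd, 0, size);
        break;
    case MODE_OFF:
        /* Only the non-preallocating mode needs an explicit resize. */
        if (ftruncate(fd, size) != 0) {
            result = -errno;
        }
        break;
    default:
        result = -EINVAL;
        break;
    }

    if (close(fd) != 0 && result == 0) {
        result = -errno;
    }
    return result;
}

/* Apparent size of path in bytes, or -1 on error. */
static off_t file_size(const char *path)
{
    struct stat st;

    return stat(path, &st) == 0 ? st.st_size : -1;
}
```

Calling `create_raw_model()` with either mode and then `file_size()` on the result should report the requested size in both cases.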
When using a file system that does not support fallocate() (e.g. NFS <
4.2), truncating the file only when preallocation=OFF speeds up creating
a raw file.

Here is an example run, tested on a Fedora 24 machine, creating a raw file
on an NFS version 3 server.

$ time ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 1g
Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc

real    0m21.185s
user    0m0.022s
sys     0m0.574s

$ time ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 1g
Formatting 'mnt/test', fmt=raw size=1073741824 preallocation=falloc

real    0m11.601s
user    0m0.016s
sys     0m0.525s

$ time dd if=/dev/zero of=mnt/test bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 15.6627 s, 68.6 MB/s

real    0m16.104s
user    0m0.009s
sys     0m0.220s

Running with strace we can see that without this change we do one
pread() and one pwrite() for each block. With this change, we do only
one pwrite() per block.

$ strace ./qemu-img-master create -f raw -o preallocation=falloc mnt/test 8192
...
pread64(9, "\0", 1, 4095) = 1
pwrite64(9, "\0", 1, 4095) = 1
pread64(9, "\0", 1, 8191) = 1
pwrite64(9, "\0", 1, 8191) = 1

$ strace ./qemu-img-fix create -f raw -o preallocation=falloc mnt/test 8192
...
pwrite64(9, "\0", 1, 4095) = 1
pwrite64(9, "\0", 1, 8191) = 1

This happens because posix_fallocate() checks whether each block is
allocated before writing a byte to it, and when the file is truncated
before preallocation, all blocks are unallocated.

Signed-off-by: Nir Soffer <nirsof@gmail.com>
---

I sent this a week ago:
http://lists.nongnu.org/archive/html/qemu-devel/2017-01/msg06123.html

Sending again with improved commit message.

 block/file-posix.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
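The pread()/pwrite() pattern in the strace output can be reproduced with a small model of a posix_fallocate() fallback for file systems without fallocate() support. This is a simplified sketch, not the libc implementation: `emulate_fallocate`, `run_case` and `io_counts` are hypothetical names, and `len` is assumed to be a multiple of `block`. The model reads a block's last byte only when that offset lies inside the current file size, which is why truncating first adds one read per block.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Counts of the calls issued by the model below. */
struct io_counts {
    long preads;
    long pwrites;
};

/*
 * Touch the last byte of every block in [0, len). A block inside the
 * current file size is read first and skipped if the byte is non-zero.
 * Returns 0 on success or an errno value on failure.
 */
static int emulate_fallocate(int fd, off_t len, off_t block,
                             struct io_counts *n)
{
    struct stat st;

    assert(block > 0 && len % block == 0);
    if (fstat(fd, &st) != 0) {
        return errno;
    }
    for (off_t off = block - 1; off < len; off += block) {
        if (off < st.st_size) {
            unsigned char c;
            ssize_t r = pread(fd, &c, 1, off);

            if (r < 0) {
                return errno;
            }
            n->preads++;
            if (r == 1 && c != 0) {
                continue;   /* non-zero data: block already allocated */
            }
        }
        if (pwrite(fd, "", 1, off) != 1) {
            return errno;
        }
        n->pwrites++;
    }
    return 0;
}

/* Run the model on a fresh temporary file, optionally truncating first. */
static struct io_counts run_case(off_t len, off_t block, int truncate_first)
{
    char path[] = "/tmp/falloc-model-XXXXXX";
    struct io_counts n = {0, 0};
    int fd = mkstemp(path);

    if (fd < 0) {
        abort();
    }
    unlink(path);   /* the file goes away once fd is closed */
    if (truncate_first && ftruncate(fd, len) != 0) {
        abort();
    }
    if (emulate_fallocate(fd, len, block, &n) != 0) {
        abort();
    }
    close(fd);
    return n;
}
```

With `run_case(8192, 4096, 1)` the model issues two reads and two writes, matching the qemu-img-master trace; with `run_case(8192, 4096, 0)` it issues only the two writes, matching the qemu-img-fix trace.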