Message ID | 20180322073822.25795-1-famz@redhat.com (mailing list archive) |
---|---|
State | New, archived |
On 22/03/2018 08:38, Fam Zheng wrote:
> Some backends report big max_io_sectors. Making min_io_size the same
> value in this case will make it impossible for guest to align memory,
> therefore the disk may not be usable at all.
>
> Change the default behavior (when min_io_size and opt_io_size are not
> specified in the command line), do not assume max_io_sectors is a good
> value for opt_io_size and min_io_size, use 512 instead.
>
> Reported-by: David Gibson <dgibson@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  hw/scsi/scsi-disk.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 5b7a48f5a5..76e3c9eaa4 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -714,10 +714,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
>
>              /* min_io_size and opt_io_size can't be greater than
>               * max_io_sectors */
> -            min_io_size =
> -                MIN_NON_ZERO(min_io_size, max_io_sectors);
> -            opt_io_size =
> -                MIN_NON_ZERO(opt_io_size, max_io_sectors);
> +            min_io_size = MIN(min_io_size ? : 512, max_io_sectors);
> +            opt_io_size = MIN(opt_io_size ? : 512, max_io_sectors);

There are a few easily fixed issues with your chosen defaults, though
the problem obviously makes sense:

1) the values are in sectors - since you chose 512, it's not clear if
you meant it to be 512 bytes or 512 sectors. :) 512 sectors (256 KiB or
2 MiB depending on logical block size) is still too much for the
min_io_size. The min_io_size default (if it is 0) is the physical block
size, so I think we should make the min_io_size either 0 or the physical
block size.

2) For the opt_io_size, 256 KiB on the other hand is probably too
little. On my laptop (NVMe disk) a transfer size of 8 MiB is twice as
fast compared to a transfer size of 256 KiB, and 16 MiB or 32 MiB is a
little faster too. I would either leave zero as the default, or pick
something around 16-32 MiB.

Thanks,

Paolo

>          }
>          /* required VPD size with unmap support */
>          buflen = 0x40;
>
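To make the unit question in point 1 concrete, here is a minimal sketch
of the arithmetic, assuming logical block sizes of 512 and 4096 bytes.
The helper names blocks_to_bytes() and bytes_to_blocks() are made up for
this illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an I/O size expressed in logical blocks (the unit discussed
 * above for min_io_size/opt_io_size) into bytes.
 */
uint64_t blocks_to_bytes(uint32_t blocks, uint32_t logical_block_size)
{
    assert(logical_block_size != 0);
    return (uint64_t)blocks * logical_block_size;
}

/*
 * Convert a size in bytes into a count of logical blocks, rounding down.
 */
uint32_t bytes_to_blocks(uint64_t bytes, uint32_t logical_block_size)
{
    assert(logical_block_size != 0);
    return (uint32_t)(bytes / logical_block_size);
}
```

With these helpers, blocks_to_bytes(512, 512) is 262144 (256 KiB) and
blocks_to_bytes(512, 4096) is 2097152 (2 MiB). Going the other way, a
16 MiB opt_io_size on a 512-byte logical block device corresponds to
bytes_to_blocks(16 MiB, 512) = 32768 blocks.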
Hi,

On 03/22/2018 04:38 AM, Fam Zheng wrote:
> Some backends report big max_io_sectors. Making min_io_size the same
> value in this case will make it impossible for guest to align memory,
> therefore the disk may not be usable at all.
>
> Change the default behavior (when min_io_size and opt_io_size are not
> specified in the command line), do not assume max_io_sectors is a good
> value for opt_io_size and min_io_size, use 512 instead.
>
> Reported-by: David Gibson <dgibson@redhat.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  hw/scsi/scsi-disk.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 5b7a48f5a5..76e3c9eaa4 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -714,10 +714,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
>
>              /* min_io_size and opt_io_size can't be greater than
>               * max_io_sectors */
> -            min_io_size =
> -                MIN_NON_ZERO(min_io_size, max_io_sectors);
> -            opt_io_size =
> -                MIN_NON_ZERO(opt_io_size, max_io_sectors);
> +            min_io_size = MIN(min_io_size ? : 512, max_io_sectors);
> +            opt_io_size = MIN(opt_io_size ? : 512, max_io_sectors);
>          }

This code you're changing was added in d082d16a5c ("consider
bl->max_transfer ..").
I've borrowed this logic from scsi-generic.c, scsi_read_complete:

    if (s->type == TYPE_DISK &&
        r->req.cmd.buf[0] == INQUIRY &&
        r->req.cmd.buf[2] == 0xb0) {
        uint32_t max_transfer =
            blk_get_max_transfer(s->conf.blk) / s->blocksize;

        assert(max_transfer);
        stl_be_p(&r->buf[8], max_transfer);
        /* Also take care of the opt xfer len. */
        stl_be_p(&r->buf[12],
                 MIN_NON_ZERO(max_transfer, ldl_be_p(&r->buf[12])));
    }

Unless I've misunderstood the bug, you will want to change this code too.
Otherwise you'll fix it with emulated disks but it might appear when
using SCSI passthrough.

Thanks,

Daniel

>          /* required VPD size with unmap support */
>          buflen = 0x40;
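The byte patching done by the scsi-generic.c excerpt above can be
sketched in isolation as follows. This is a standalone illustration, not
QEMU code: load_be32(), store_be32() and patch_block_limits() are
hypothetical stand-ins for ldl_be_p(), stl_be_p() and the body of the
if block, and min_non_zero() is assumed to behave like MIN_NON_ZERO in
the excerpt (the smaller of the two values, with a zero treated as
"unset").

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Write a 32-bit value to a byte buffer in big-endian order. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Smaller of a and b, where a zero argument means "unset". */
uint32_t min_non_zero(uint32_t a, uint32_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/*
 * Patch a block limits (0xb0) VPD page in place, as in the excerpt:
 * bytes 8..11 get the host's maximum transfer length in blocks, and
 * the optimal transfer length at bytes 12..15 is capped at that value.
 */
void patch_block_limits(uint8_t *buf, uint32_t max_transfer_blocks)
{
    assert(max_transfer_blocks != 0);
    store_be32(&buf[8], max_transfer_blocks);
    store_be32(&buf[12],
               min_non_zero(max_transfer_blocks, load_be32(&buf[12])));
}
```

Note that when the device reports an optimal transfer length of 0, this
logic replaces it with the maximum transfer length, which is the same
"unset becomes max" behaviour the scsi-disk patch is trying to get rid
of.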
On Thu, 03/22 09:19, Daniel Henrique Barboza wrote:
> Hi,
>
> On 03/22/2018 04:38 AM, Fam Zheng wrote:
> > Some backends report big max_io_sectors. Making min_io_size the same
> > value in this case will make it impossible for guest to align memory,
> > therefore the disk may not be usable at all.
> >
> > Change the default behavior (when min_io_size and opt_io_size are not
> > specified in the command line), do not assume max_io_sectors is a good
> > value for opt_io_size and min_io_size, use 512 instead.
> >
> > Reported-by: David Gibson <dgibson@redhat.com>
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> > ---
> >  hw/scsi/scsi-disk.c | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> > index 5b7a48f5a5..76e3c9eaa4 100644
> > --- a/hw/scsi/scsi-disk.c
> > +++ b/hw/scsi/scsi-disk.c
> > @@ -714,10 +714,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
> >
> >              /* min_io_size and opt_io_size can't be greater than
> >               * max_io_sectors */
> > -            min_io_size =
> > -                MIN_NON_ZERO(min_io_size, max_io_sectors);
> > -            opt_io_size =
> > -                MIN_NON_ZERO(opt_io_size, max_io_sectors);
> > +            min_io_size = MIN(min_io_size ? : 512, max_io_sectors);
> > +            opt_io_size = MIN(opt_io_size ? : 512, max_io_sectors);
> >          }
>
> This code you're changing was added in d082d16a5c ("consider
> bl->max_transfer ..").
> I've borrowed this logic from scsi-generic.c, scsi_read_complete:
>
>     if (s->type == TYPE_DISK &&
>         r->req.cmd.buf[0] == INQUIRY &&
>         r->req.cmd.buf[2] == 0xb0) {
>         uint32_t max_transfer =
>             blk_get_max_transfer(s->conf.blk) / s->blocksize;
>
>         assert(max_transfer);
>         stl_be_p(&r->buf[8], max_transfer);
>         /* Also take care of the opt xfer len. */
>         stl_be_p(&r->buf[12],
>                  MIN_NON_ZERO(max_transfer, ldl_be_p(&r->buf[12])));
>     }
>
>
> Unless I've misunderstood the bug, you will want to change this code too.
> Otherwise
> you'll fix it with emulated disks but it might appear when using SCSI
> passthrough.

I am assuming (because I don't have a reproducer myself) what matters is
min_io_size here.

David, could you help test if you see the same problem with "-device
scsi-block"? If so, I'll patch scsi-generic.c in v2 too.

Fam
On Mon, 26 Mar 2018 15:26:39 +0800
Fam Zheng <famz@redhat.com> wrote:

> On Thu, 03/22 09:19, Daniel Henrique Barboza wrote:
> > Hi,
> >
> > On 03/22/2018 04:38 AM, Fam Zheng wrote:
> > > Some backends report big max_io_sectors. Making min_io_size the same
> > > value in this case will make it impossible for guest to align memory,
> > > therefore the disk may not be usable at all.
> > >
> > > Change the default behavior (when min_io_size and opt_io_size are not
> > > specified in the command line), do not assume max_io_sectors is a good
> > > value for opt_io_size and min_io_size, use 512 instead.
> > >
> > > Reported-by: David Gibson <dgibson@redhat.com>
> > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > ---
> > >  hw/scsi/scsi-disk.c | 6 ++----
> > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> > > index 5b7a48f5a5..76e3c9eaa4 100644
> > > --- a/hw/scsi/scsi-disk.c
> > > +++ b/hw/scsi/scsi-disk.c
> > > @@ -714,10 +714,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
> > >
> > >              /* min_io_size and opt_io_size can't be greater than
> > >               * max_io_sectors */
> > > -            min_io_size =
> > > -                MIN_NON_ZERO(min_io_size, max_io_sectors);
> > > -            opt_io_size =
> > > -                MIN_NON_ZERO(opt_io_size, max_io_sectors);
> > > +            min_io_size = MIN(min_io_size ? : 512, max_io_sectors);
> > > +            opt_io_size = MIN(opt_io_size ? : 512, max_io_sectors);
> > >          }
> >
> > This code you're changing was added in d082d16a5c ("consider
> > bl->max_transfer ..").
> > I've borrowed this logic from scsi-generic.c, scsi_read_complete:
> >
> >     if (s->type == TYPE_DISK &&
> >         r->req.cmd.buf[0] == INQUIRY &&
> >         r->req.cmd.buf[2] == 0xb0) {
> >         uint32_t max_transfer =
> >             blk_get_max_transfer(s->conf.blk) / s->blocksize;
> >
> >         assert(max_transfer);
> >         stl_be_p(&r->buf[8], max_transfer);
> >         /* Also take care of the opt xfer len. */
> >         stl_be_p(&r->buf[12],
> >                  MIN_NON_ZERO(max_transfer, ldl_be_p(&r->buf[12])));
> >     }
> >
> >
> > Unless I've misunderstood the bug, you will want to change this code too.
> > Otherwise
> > you'll fix it with emulated disks but it might appear when using SCSI
> > passthrough.
>
> I am assuming (because I don't have a reproducer myself)

Sorry, I should have given you specific reproduce instructions. You
don't need a POWER host - I've verified that the bug trips under TCG.

1. Grab a RHEL ppc64le install image (other installers could well
   also hit it, but I haven't tried them)
2. Build current qemu master, including the ppc64-softmmu target
3. Create a fresh new guest disk image
     qemu-img create -f qcow2 disk.qcow2 20G
4. Attempt to install the new guest:
     $QEMU -nodefaults -nographic -machine pseries \
           -cpu POWER8 -smp 1 -m 1G \
           -chardev stdio,id=conmon,mux=on,signal=off \
           -device spapr-vty,chardev=conmon \
           -mon conmon \
           -device virtio-scsi-pci,id=scsi \
           -drive file=disk.qcow2,if=none,format=qcow2,id=hd \
           -device scsi-disk,drive=hd,bus=scsi.0 \
           -drive file=RHEL-7.4-20170711.0-Server-ppc64le-dvd1.iso,format=raw,media=cdrom,if=none,id=cd \
           -device scsi-cd,drive=cd,bus=scsi.0

That's using the RHEL7.4 GA image, a recent 7.5 snapshot also works as may others.

> what matters is
> min_io_size here.
>
> David, could you help test if you see the same problem with "-device
> scsi-block"? If so, I'll patch scsi-generic.c in v2 too.

I'm not sure exactly what you want me to check here? You mean putting
the guest disk on a scsi-block instead of scsi-disk? That's a bit more
fiddly, since I have to find a block device to back it instead of an
image.
On Tue, 03/27 14:44, David Gibson wrote:
> On Mon, 26 Mar 2018 15:26:39 +0800
> Fam Zheng <famz@redhat.com> wrote:
>
> > On Thu, 03/22 09:19, Daniel Henrique Barboza wrote:
> > > Hi,
> > >
> > > On 03/22/2018 04:38 AM, Fam Zheng wrote:
> > > > Some backends report big max_io_sectors. Making min_io_size the same
> > > > value in this case will make it impossible for guest to align memory,
> > > > therefore the disk may not be usable at all.
> > > >
> > > > Change the default behavior (when min_io_size and opt_io_size are not
> > > > specified in the command line), do not assume max_io_sectors is a good
> > > > value for opt_io_size and min_io_size, use 512 instead.
> > > >
> > > > Reported-by: David Gibson <dgibson@redhat.com>
> > > > Signed-off-by: Fam Zheng <famz@redhat.com>
> > > > ---
> > > >  hw/scsi/scsi-disk.c | 6 ++----
> > > >  1 file changed, 2 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> > > > index 5b7a48f5a5..76e3c9eaa4 100644
> > > > --- a/hw/scsi/scsi-disk.c
> > > > +++ b/hw/scsi/scsi-disk.c
> > > > @@ -714,10 +714,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
> > > >
> > > >              /* min_io_size and opt_io_size can't be greater than
> > > >               * max_io_sectors */
> > > > -            min_io_size =
> > > > -                MIN_NON_ZERO(min_io_size, max_io_sectors);
> > > > -            opt_io_size =
> > > > -                MIN_NON_ZERO(opt_io_size, max_io_sectors);
> > > > +            min_io_size = MIN(min_io_size ? : 512, max_io_sectors);
> > > > +            opt_io_size = MIN(opt_io_size ? : 512, max_io_sectors);
> > > >          }
> > >
> > > This code you're changing was added in d082d16a5c ("consider
> > > bl->max_transfer ..").
> > > I've borrowed this logic from scsi-generic.c, scsi_read_complete:
> > >
> > >     if (s->type == TYPE_DISK &&
> > >         r->req.cmd.buf[0] == INQUIRY &&
> > >         r->req.cmd.buf[2] == 0xb0) {
> > >         uint32_t max_transfer =
> > >             blk_get_max_transfer(s->conf.blk) / s->blocksize;
> > >
> > >         assert(max_transfer);
> > >         stl_be_p(&r->buf[8], max_transfer);
> > >         /* Also take care of the opt xfer len. */
> > >         stl_be_p(&r->buf[12],
> > >                  MIN_NON_ZERO(max_transfer, ldl_be_p(&r->buf[12])));
> > >     }
> > >
> > >
> > > Unless I've misunderstood the bug, you will want to change this code too.
> > > Otherwise
> > > you'll fix it with emulated disks but it might appear when using SCSI
> > > passthrough.
> >
> > I am assuming (because I don't have a reproducer myself)
>
> Sorry, I should have given you specific reproduce instructions. You
> don't need a POWER host - I've verified that the bug trips under TCG.
>
> 1. Grab a RHEL ppc64le install image (other installers could well
>    also hit it, but I haven't tried them)
> 2. Build current qemu master, including the ppc64-softmmu target
> 3. Create a fresh new guest disk image
>      qemu-img create -f qcow2 disk.qcow2 20G
> 4. Attempt to install the new guest:
>      $QEMU -nodefaults -nographic -machine pseries \
>            -cpu POWER8 -smp 1 -m 1G \
>            -chardev stdio,id=conmon,mux=on,signal=off \
>            -device spapr-vty,chardev=conmon \
>            -mon conmon \
>            -device virtio-scsi-pci,id=scsi \
>            -drive file=disk.qcow2,if=none,format=qcow2,id=hd \
>            -device scsi-disk,drive=hd,bus=scsi.0 \
>            -drive file=RHEL-7.4-20170711.0-Server-ppc64le-dvd1.iso,format=raw,media=cdrom,if=none,id=cd \
>            -device scsi-cd,drive=cd,bus=scsi.0
>
> That's using the RHEL7.4 GA image, a recent 7.5 snapshot also works as may others.

Thanks, your reproducer works. So I've verified that fixing min_io_size
alone will eliminate the problem. So there is no such problem for
scsi-block.

Of course aligning opt_io_size up to max_io_size is dubious but as far
as fixing guest I/O, I think touching up scsi-disk is okay. I'll address
Paolo's comments and post v2.

Fam
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 5b7a48f5a5..76e3c9eaa4 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -714,10 +714,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
 
             /* min_io_size and opt_io_size can't be greater than
              * max_io_sectors */
-            min_io_size =
-                MIN_NON_ZERO(min_io_size, max_io_sectors);
-            opt_io_size =
-                MIN_NON_ZERO(opt_io_size, max_io_sectors);
+            min_io_size = MIN(min_io_size ? : 512, max_io_sectors);
+            opt_io_size = MIN(opt_io_size ? : 512, max_io_sectors);
         }
         /* required VPD size with unmap support */
         buflen = 0x40;
Some backends report big max_io_sectors. Making min_io_size the same
value in this case will make it impossible for guest to align memory,
therefore the disk may not be usable at all.

Change the default behavior (when min_io_size and opt_io_size are not
specified in the command line), do not assume max_io_sectors is a good
value for opt_io_size and min_io_size, use 512 instead.

Reported-by: David Gibson <dgibson@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 hw/scsi/scsi-disk.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
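The behavioural difference between the old and the patched clamping can
be sketched outside QEMU as follows. min_non_zero(), clamp_old() and
clamp_patched() are illustrative names, not QEMU functions.
min_non_zero() is assumed to mirror MIN_NON_ZERO (a zero argument means
"unset" and is ignored), the GNU "x ? : y" shorthand from the patch is
spelled out as a plain conditional, and the max_io_sectors value in the
note below is an arbitrary example of a "big" backend limit.

```c
#include <assert.h>
#include <stdint.h>

/* Smaller of a and b, where a zero argument means "unset". */
uint32_t min_non_zero(uint32_t a, uint32_t b)
{
    if (a == 0) {
        return b;
    }
    if (b == 0) {
        return a;
    }
    return a < b ? a : b;
}

/* Old behaviour: an unset io_size silently becomes max_io_sectors. */
uint32_t clamp_old(uint32_t io_size, uint32_t max_io_sectors)
{
    return min_non_zero(io_size, max_io_sectors);
}

/*
 * Patched behaviour: an unset io_size defaults to 512 and is then
 * capped at max_io_sectors, which is assumed to be non-zero here.
 */
uint32_t clamp_patched(uint32_t io_size, uint32_t max_io_sectors)
{
    uint32_t value = io_size ? io_size : 512;

    assert(max_io_sectors != 0);
    return value < max_io_sectors ? value : max_io_sectors;
}
```

With io_size left at 0 and max_io_sectors at 4194304, clamp_old()
returns 4194304 while clamp_patched() returns 512. An explicitly
configured io_size that is below the limit passes through both helpers
unchanged.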