Message ID | 20200123124357.124019-1-felipe@nutanix.com (mailing list archive)
State      | New, archived
Series     | iscsi: Cap block count from GET LBA STATUS (CVE-2020-1711)
On 23.01.2020 at 13:44, Felipe Franciosi wrote:
> When querying an iSCSI server for the provisioning status of blocks (via
> GET LBA STATUS), Qemu only validates that the response descriptor zero's
> LBA matches the one requested. Given the SCSI spec allows servers to
> respond with the status of blocks beyond the end of the LUN, Qemu may
> have its heap corrupted by clearing/setting too many bits at the end of
> its allocmap for the LUN.
>
> A malicious guest in control of the iSCSI server could carefully program
> Qemu's heap (by selectively setting the bitmap) and then smash it.
>
> This limits the number of bits that iscsi_co_block_status() will try to
> update in the allocmap so it can't overflow the bitmap.
>
> Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>

Thanks, applied to the block branch.

Kevin
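For readers new to the allocmap code, here is a back-of-the-envelope illustration of what the new max_bytes cap does. All numbers are made up for this sketch and MIN() is QEMU's macro from qemu/osdep.h; nothing below comes from an actual trace:

    uint64_t block_size = 512;
    uint64_t num_blocks = 2048;                    /* LUN is 1 MiB                 */
    uint64_t lba        = 2000;                    /* request 24 KiB from the end  */
    uint64_t reported   = 4096;                    /* blocks claimed by the target */

    uint64_t max_bytes  = (num_blocks - lba) * block_size;   /*   24576 bytes */
    uint64_t uncapped   = reported * block_size;             /* 2097152 bytes */
    uint64_t pnum       = MIN(uncapped, max_bytes);          /*   24576 bytes */

Without the MIN(), iscsi_allocmap_update() would be asked to flip bits covering roughly 2 MiB of data starting 24 KiB before the end of a 1 MiB LUN, i.e. well past the end of the bitmap.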
On 1/23/20 1:44 PM, Felipe Franciosi wrote: > When querying an iSCSI server for the provisioning status of blocks (via > GET LBA STATUS), Qemu only validates that the response descriptor zero's > LBA matches the one requested. Given the SCSI spec allows servers to > respond with the status of blocks beyond the end of the LUN, Qemu may > have its heap corrupted by clearing/setting too many bits at the end of > its allocmap for the LUN. > > A malicious guest in control of the iSCSI server could carefully program > Qemu's heap (by selectively setting the bitmap) and then smash it. > > This limits the number of bits that iscsi_co_block_status() will try to > update in the allocmap so it can't overflow the bitmap. > Please add: Fixes: CVE-2020-1711 (title of CVE if possible) Cc: qemu-stable@nongnu.org > Signed-off-by: Felipe Franciosi <felipe@nutanix.com> > Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> > Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> > --- > block/iscsi.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/block/iscsi.c b/block/iscsi.c > index 2aea7e3f13..cbd57294ab 100644 > --- a/block/iscsi.c > +++ b/block/iscsi.c > @@ -701,7 +701,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, > struct scsi_get_lba_status *lbas = NULL; > struct scsi_lba_status_descriptor *lbasd = NULL; > struct IscsiTask iTask; > - uint64_t lba; > + uint64_t lba, max_bytes; > int ret; > > iscsi_co_init_iscsitask(iscsilun, &iTask); > @@ -721,6 +721,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, > } > > lba = offset / iscsilun->block_size; > + max_bytes = (iscsilun->num_blocks - lba) * iscsilun->block_size; > > qemu_mutex_lock(&iscsilun->mutex); > retry: > @@ -764,7 +765,7 @@ retry: > goto out_unlock; > } > > - *pnum = (int64_t) lbasd->num_blocks * iscsilun->block_size; > + *pnum = MIN((int64_t) lbasd->num_blocks * iscsilun->block_size, max_bytes); > > if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED || > lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) { > What about this? -- >8 -- diff --git a/block/iscsi.c b/block/iscsi.c index 2aea7e3f13..25598accbb 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -506,6 +506,11 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset, /* shrink to touch only completely contained clusters */ cl_num_shrunk = DIV_ROUND_UP(offset, iscsilun->cluster_size); nb_cls_shrunk = (offset + bytes) / iscsilun->cluster_size - cl_num_shrunk; + if (nb_cls_expanded >= iscsilun->allocmap_size + || nb_cls_shrunk >= iscsilun->allocmap_size) { + error_report("iSCSI invalid request: ..." /* TODO */); + return; + } if (allocated) { bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded); } else { ---
Hi, > On Jan 23, 2020, at 5:46 PM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > On 1/23/20 1:44 PM, Felipe Franciosi wrote: >> When querying an iSCSI server for the provisioning status of blocks (via >> GET LBA STATUS), Qemu only validates that the response descriptor zero's >> LBA matches the one requested. Given the SCSI spec allows servers to >> respond with the status of blocks beyond the end of the LUN, Qemu may >> have its heap corrupted by clearing/setting too many bits at the end of >> its allocmap for the LUN. >> A malicious guest in control of the iSCSI server could carefully program >> Qemu's heap (by selectively setting the bitmap) and then smash it. >> This limits the number of bits that iscsi_co_block_status() will try to >> update in the allocmap so it can't overflow the bitmap. > > Please add: > > Fixes: CVE-2020-1711 (title of CVE if possible) I wasn't sure we had one yet. Kevin: can you do the needful in your branch? > Cc: qemu-stable@nongnu.org Yeah, that's there. > >> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> >> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> >> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> >> --- >> block/iscsi.c | 5 +++-- >> 1 file changed, 3 insertions(+), 2 deletions(-) >> diff --git a/block/iscsi.c b/block/iscsi.c >> index 2aea7e3f13..cbd57294ab 100644 >> --- a/block/iscsi.c >> +++ b/block/iscsi.c >> @@ -701,7 +701,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >> struct scsi_get_lba_status *lbas = NULL; >> struct scsi_lba_status_descriptor *lbasd = NULL; >> struct IscsiTask iTask; >> - uint64_t lba; >> + uint64_t lba, max_bytes; >> int ret; >> iscsi_co_init_iscsitask(iscsilun, &iTask); >> @@ -721,6 +721,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >> } >> lba = offset / iscsilun->block_size; >> + max_bytes = (iscsilun->num_blocks - lba) * iscsilun->block_size; >> qemu_mutex_lock(&iscsilun->mutex); >> retry: >> @@ -764,7 +765,7 @@ retry: >> goto out_unlock; >> } >> - *pnum = (int64_t) lbasd->num_blocks * iscsilun->block_size; >> + *pnum = MIN((int64_t) lbasd->num_blocks * iscsilun->block_size, max_bytes); >> if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED || >> lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) { > > What about this? > > -- >8 -- > diff --git a/block/iscsi.c b/block/iscsi.c > index 2aea7e3f13..25598accbb 100644 > --- a/block/iscsi.c > +++ b/block/iscsi.c > @@ -506,6 +506,11 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset, > /* shrink to touch only completely contained clusters */ > cl_num_shrunk = DIV_ROUND_UP(offset, iscsilun->cluster_size); > nb_cls_shrunk = (offset + bytes) / iscsilun->cluster_size - cl_num_shrunk; > + if (nb_cls_expanded >= iscsilun->allocmap_size > + || nb_cls_shrunk >= iscsilun->allocmap_size) { > + error_report("iSCSI invalid request: ..." /* TODO */); > + return; > + } > if (allocated) { > bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded); > } else { > --- I'm not sure the above is correct because (if I read this right) nb_cls_* represents the number of clusters, not the last cluster. Personally, I would have the checks (or "trim"s) closer to where they were issued (to fail sooner) and assert()s closer to bitmap (as no oob accesses should be happening at this point). There were also discussions about using safer (higher level) bitmaps for this. I'm always in favour of adding all reasonable checks. :) Cheers, Felipe
> Am 23.01.2020 um 22:29 schrieb Felipe Franciosi <felipe@nutanix.com>: > > Hi, > >> On Jan 23, 2020, at 5:46 PM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >> >>> On 1/23/20 1:44 PM, Felipe Franciosi wrote: >>> When querying an iSCSI server for the provisioning status of blocks (via >>> GET LBA STATUS), Qemu only validates that the response descriptor zero's >>> LBA matches the one requested. Given the SCSI spec allows servers to >>> respond with the status of blocks beyond the end of the LUN, Qemu may >>> have its heap corrupted by clearing/setting too many bits at the end of >>> its allocmap for the LUN. >>> A malicious guest in control of the iSCSI server could carefully program >>> Qemu's heap (by selectively setting the bitmap) and then smash it. >>> This limits the number of bits that iscsi_co_block_status() will try to >>> update in the allocmap so it can't overflow the bitmap. >> >> Please add: >> >> Fixes: CVE-2020-1711 (title of CVE if possible) > > I wasn't sure we had one yet. Kevin: can you do the needful in your branch? > >> Cc: qemu-stable@nongnu.org > > Yeah, that's there. > >> >>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> >>> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> >>> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> >>> --- >>> block/iscsi.c | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> diff --git a/block/iscsi.c b/block/iscsi.c >>> index 2aea7e3f13..cbd57294ab 100644 >>> --- a/block/iscsi.c >>> +++ b/block/iscsi.c >>> @@ -701,7 +701,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >>> struct scsi_get_lba_status *lbas = NULL; >>> struct scsi_lba_status_descriptor *lbasd = NULL; >>> struct IscsiTask iTask; >>> - uint64_t lba; >>> + uint64_t lba, max_bytes; >>> int ret; >>> iscsi_co_init_iscsitask(iscsilun, &iTask); >>> @@ -721,6 +721,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >>> } >>> lba = offset / iscsilun->block_size; >>> + max_bytes = (iscsilun->num_blocks - lba) * iscsilun->block_size; >>> qemu_mutex_lock(&iscsilun->mutex); >>> retry: >>> @@ -764,7 +765,7 @@ retry: >>> goto out_unlock; >>> } >>> - *pnum = (int64_t) lbasd->num_blocks * iscsilun->block_size; >>> + *pnum = MIN((int64_t) lbasd->num_blocks * iscsilun->block_size, max_bytes); >>> if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED || >>> lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) { >> >> What about this? >> >> -- >8 -- >> diff --git a/block/iscsi.c b/block/iscsi.c >> index 2aea7e3f13..25598accbb 100644 >> --- a/block/iscsi.c >> +++ b/block/iscsi.c >> @@ -506,6 +506,11 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset, >> /* shrink to touch only completely contained clusters */ >> cl_num_shrunk = DIV_ROUND_UP(offset, iscsilun->cluster_size); >> nb_cls_shrunk = (offset + bytes) / iscsilun->cluster_size - cl_num_shrunk; >> + if (nb_cls_expanded >= iscsilun->allocmap_size >> + || nb_cls_shrunk >= iscsilun->allocmap_size) { >> + error_report("iSCSI invalid request: ..." /* TODO */); >> + return; >> + } >> if (allocated) { >> bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded); >> } else { >> --- > > I'm not sure the above is correct because (if I read this right) > nb_cls_* represents the number of clusters, not the last cluster. > > Personally, I would have the checks (or "trim"s) closer to where they > were issued (to fail sooner) and assert()s closer to bitmap (as no oob > accesses should be happening at this point). 
> There were also discussions about using safer (higher level) bitmaps for this. I'm always in favour of adding all reasonable checks. :)

I would add assertions that cl_num + nb_cls <= allocmap_size before every set and clear.

Peter
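A minimal sketch of the assertions Peter describes, reusing the variable names from iscsi_allocmap_update() as quoted in the patches above (the else branch is abridged; this is an illustration, not a tested change):

    if (allocated) {
        assert(cl_num_expanded + nb_cls_expanded <= iscsilun->allocmap_size);
        bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded);
    } else {
        assert(cl_num_shrunk + nb_cls_shrunk <= iscsilun->allocmap_size);
        bitmap_clear(iscsilun->allocmap, cl_num_shrunk, nb_cls_shrunk);
    }

Unlike the earlier range check, this bounds cl_num + nb_cls (first touched bit plus count) rather than the count alone, which is the point Felipe raises above.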
On 1/23/20 11:58 PM, Peter Lieven wrote: >> Am 23.01.2020 um 22:29 schrieb Felipe Franciosi <felipe@nutanix.com>: >>> On Jan 23, 2020, at 5:46 PM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >>>> On 1/23/20 1:44 PM, Felipe Franciosi wrote: >>>> When querying an iSCSI server for the provisioning status of blocks (via >>>> GET LBA STATUS), Qemu only validates that the response descriptor zero's >>>> LBA matches the one requested. Given the SCSI spec allows servers to >>>> respond with the status of blocks beyond the end of the LUN, Qemu may >>>> have its heap corrupted by clearing/setting too many bits at the end of >>>> its allocmap for the LUN. >>>> A malicious guest in control of the iSCSI server could carefully program >>>> Qemu's heap (by selectively setting the bitmap) and then smash it. >>>> This limits the number of bits that iscsi_co_block_status() will try to >>>> update in the allocmap so it can't overflow the bitmap. >>> >>> Please add: >>> >>> Fixes: CVE-2020-1711 (title of CVE if possible) >> >> I wasn't sure we had one yet. Kevin: can you do the needful in your branch? >> >>> Cc: qemu-stable@nongnu.org >> >> Yeah, that's there. >> >>> >>>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> >>>> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> >>>> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> >>>> --- >>>> block/iscsi.c | 5 +++-- >>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>> diff --git a/block/iscsi.c b/block/iscsi.c >>>> index 2aea7e3f13..cbd57294ab 100644 >>>> --- a/block/iscsi.c >>>> +++ b/block/iscsi.c >>>> @@ -701,7 +701,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >>>> struct scsi_get_lba_status *lbas = NULL; >>>> struct scsi_lba_status_descriptor *lbasd = NULL; >>>> struct IscsiTask iTask; >>>> - uint64_t lba; >>>> + uint64_t lba, max_bytes; >>>> int ret; >>>> iscsi_co_init_iscsitask(iscsilun, &iTask); >>>> @@ -721,6 +721,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >>>> } >>>> lba = offset / iscsilun->block_size; >>>> + max_bytes = (iscsilun->num_blocks - lba) * iscsilun->block_size; >>>> qemu_mutex_lock(&iscsilun->mutex); >>>> retry: >>>> @@ -764,7 +765,7 @@ retry: >>>> goto out_unlock; >>>> } >>>> - *pnum = (int64_t) lbasd->num_blocks * iscsilun->block_size; >>>> + *pnum = MIN((int64_t) lbasd->num_blocks * iscsilun->block_size, max_bytes); >>>> if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED || >>>> lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) { >>> >>> What about this? >>> >>> -- >8 -- >>> diff --git a/block/iscsi.c b/block/iscsi.c >>> index 2aea7e3f13..25598accbb 100644 >>> --- a/block/iscsi.c >>> +++ b/block/iscsi.c >>> @@ -506,6 +506,11 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset, >>> /* shrink to touch only completely contained clusters */ >>> cl_num_shrunk = DIV_ROUND_UP(offset, iscsilun->cluster_size); >>> nb_cls_shrunk = (offset + bytes) / iscsilun->cluster_size - cl_num_shrunk; >>> + if (nb_cls_expanded >= iscsilun->allocmap_size >>> + || nb_cls_shrunk >= iscsilun->allocmap_size) { >>> + error_report("iSCSI invalid request: ..." /* TODO */); >>> + return; >>> + } >>> if (allocated) { >>> bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded); >>> } else { >>> --- >> >> I'm not sure the above is correct because (if I read this right) >> nb_cls_* represents the number of clusters, not the last cluster. 
>> >> Personally, I would have the checks (or "trim"s) closer to where they >> were issued (to fail sooner) and assert()s closer to bitmap (as no oob >> accesses should be happening at this point). There were also >> discussions about using safer (higher level) bitmaps for this. I'm >> always in favour of adding all reasonable checks. :) > > I would add assertions that cl_num + nb_cls <= allocmap_size before every set and clear. The description starts with "A malicious guest in control of the iSCSI server ..." so asserting (and killing the VM) doesn't seem correct... I suppose the iSCSI protocol has some error to return for invalid requests. Also shouldn't we report some warning in case of such invalid request? So the management side can look at the 'malicious iSCSI server'?
> On Jan 24, 2020, at 10:04 AM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > On 1/23/20 11:58 PM, Peter Lieven wrote: >>> Am 23.01.2020 um 22:29 schrieb Felipe Franciosi <felipe@nutanix.com>: >>>> On Jan 23, 2020, at 5:46 PM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >>>>> On 1/23/20 1:44 PM, Felipe Franciosi wrote: >>>>> When querying an iSCSI server for the provisioning status of blocks (via >>>>> GET LBA STATUS), Qemu only validates that the response descriptor zero's >>>>> LBA matches the one requested. Given the SCSI spec allows servers to >>>>> respond with the status of blocks beyond the end of the LUN, Qemu may >>>>> have its heap corrupted by clearing/setting too many bits at the end of >>>>> its allocmap for the LUN. >>>>> A malicious guest in control of the iSCSI server could carefully program >>>>> Qemu's heap (by selectively setting the bitmap) and then smash it. >>>>> This limits the number of bits that iscsi_co_block_status() will try to >>>>> update in the allocmap so it can't overflow the bitmap. >>>> >>>> Please add: >>>> >>>> Fixes: CVE-2020-1711 (title of CVE if possible) >>> >>> I wasn't sure we had one yet. Kevin: can you do the needful in your branch? >>> >>>> Cc: qemu-stable@nongnu.org >>> >>> Yeah, that's there. >>> >>>> >>>>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> >>>>> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> >>>>> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> >>>>> --- >>>>> block/iscsi.c | 5 +++-- >>>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>>> diff --git a/block/iscsi.c b/block/iscsi.c >>>>> index 2aea7e3f13..cbd57294ab 100644 >>>>> --- a/block/iscsi.c >>>>> +++ b/block/iscsi.c >>>>> @@ -701,7 +701,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >>>>> struct scsi_get_lba_status *lbas = NULL; >>>>> struct scsi_lba_status_descriptor *lbasd = NULL; >>>>> struct IscsiTask iTask; >>>>> - uint64_t lba; >>>>> + uint64_t lba, max_bytes; >>>>> int ret; >>>>> iscsi_co_init_iscsitask(iscsilun, &iTask); >>>>> @@ -721,6 +721,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, >>>>> } >>>>> lba = offset / iscsilun->block_size; >>>>> + max_bytes = (iscsilun->num_blocks - lba) * iscsilun->block_size; >>>>> qemu_mutex_lock(&iscsilun->mutex); >>>>> retry: >>>>> @@ -764,7 +765,7 @@ retry: >>>>> goto out_unlock; >>>>> } >>>>> - *pnum = (int64_t) lbasd->num_blocks * iscsilun->block_size; >>>>> + *pnum = MIN((int64_t) lbasd->num_blocks * iscsilun->block_size, max_bytes); >>>>> if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED || >>>>> lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) { >>>> >>>> What about this? >>>> >>>> -- >8 -- >>>> diff --git a/block/iscsi.c b/block/iscsi.c >>>> index 2aea7e3f13..25598accbb 100644 >>>> --- a/block/iscsi.c >>>> +++ b/block/iscsi.c >>>> @@ -506,6 +506,11 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset, >>>> /* shrink to touch only completely contained clusters */ >>>> cl_num_shrunk = DIV_ROUND_UP(offset, iscsilun->cluster_size); >>>> nb_cls_shrunk = (offset + bytes) / iscsilun->cluster_size - cl_num_shrunk; >>>> + if (nb_cls_expanded >= iscsilun->allocmap_size >>>> + || nb_cls_shrunk >= iscsilun->allocmap_size) { >>>> + error_report("iSCSI invalid request: ..." 
/* TODO */); >>>> + return; >>>> + } >>>> if (allocated) { >>>> bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded); >>>> } else { >>>> --- >>> >>> I'm not sure the above is correct because (if I read this right) >>> nb_cls_* represents the number of clusters, not the last cluster. >>> >>> Personally, I would have the checks (or "trim"s) closer to where they >>> were issued (to fail sooner) and assert()s closer to bitmap (as no oob >>> accesses should be happening at this point). There were also >>> discussions about using safer (higher level) bitmaps for this. I'm >>> always in favour of adding all reasonable checks. :) >> I would add assertions that cl_num + nb_cls <= allocmap_size before every set and clear. > > The description starts with "A malicious guest in control of the iSCSI server ..." so asserting (and killing the VM) doesn't seem correct... Correct. That's why I would have the proper checks (or "trim"s) closer to where they were issued to fail sooner. What I meant is that if a guest issues any operation that spans past the end of the drive, then the operation stops there and an error is returned accordingly. This means nothing should ever try to touch these bitmaps out of bounds. Nevertheless, and further to that, assert()s can be used closer to where the bitmap is touched to catch programming errors. > I suppose the iSCSI protocol has some error to return for invalid requests. Which invalid you are referring to? From the initiator or the target? AFAICT the problem is that the SCSI SPEC doesn't limit a target to respond provisioning status past the (current) end of the LUN (either because this was not deemed important to stress, was forgotten, or is intentionally allowed). > > Also shouldn't we report some warning in case of such invalid request? So the management side can look at the 'malicious iSCSI server'? > I think having the option to do so is a good idea. There are two cases I can think of that you run into a "malicious" storage server: 1) Someone hacked your storage server 2) Your control plane allows your compute to connect to a user provided storage service Thinking as an admin, if I only allow storage servers I provide, then I want to see such warnings. If I let people point the VMM to dodgy servers, then I probably don't want the log spam. F.
Am 24.01.2020 um 11:48 hat Felipe Franciosi geschrieben: > > On Jan 24, 2020, at 10:04 AM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > On 1/23/20 11:58 PM, Peter Lieven wrote: > >>> Am 23.01.2020 um 22:29 schrieb Felipe Franciosi <felipe@nutanix.com>: > >>>> On Jan 23, 2020, at 5:46 PM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > >>>>> On 1/23/20 1:44 PM, Felipe Franciosi wrote: > >>>>> When querying an iSCSI server for the provisioning status of blocks (via > >>>>> GET LBA STATUS), Qemu only validates that the response descriptor zero's > >>>>> LBA matches the one requested. Given the SCSI spec allows servers to > >>>>> respond with the status of blocks beyond the end of the LUN, Qemu may > >>>>> have its heap corrupted by clearing/setting too many bits at the end of > >>>>> its allocmap for the LUN. > >>>>> A malicious guest in control of the iSCSI server could carefully program > >>>>> Qemu's heap (by selectively setting the bitmap) and then smash it. > >>>>> This limits the number of bits that iscsi_co_block_status() will try to > >>>>> update in the allocmap so it can't overflow the bitmap. > >>>> > >>>> Please add: > >>>> > >>>> Fixes: CVE-2020-1711 (title of CVE if possible) > >>> > >>> I wasn't sure we had one yet. Kevin: can you do the needful in your branch? I added the CVE number, though I don't have a title. > >>>> Cc: qemu-stable@nongnu.org > >>> > >>> Yeah, that's there. > >>> > >>>> > >>>>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> > >>>>> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> > >>>>> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> > >>>>> --- > >>>>> block/iscsi.c | 5 +++-- > >>>>> 1 file changed, 3 insertions(+), 2 deletions(-) > >>>>> diff --git a/block/iscsi.c b/block/iscsi.c > >>>>> index 2aea7e3f13..cbd57294ab 100644 > >>>>> --- a/block/iscsi.c > >>>>> +++ b/block/iscsi.c > >>>>> @@ -701,7 +701,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, > >>>>> struct scsi_get_lba_status *lbas = NULL; > >>>>> struct scsi_lba_status_descriptor *lbasd = NULL; > >>>>> struct IscsiTask iTask; > >>>>> - uint64_t lba; > >>>>> + uint64_t lba, max_bytes; > >>>>> int ret; > >>>>> iscsi_co_init_iscsitask(iscsilun, &iTask); > >>>>> @@ -721,6 +721,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs, > >>>>> } > >>>>> lba = offset / iscsilun->block_size; > >>>>> + max_bytes = (iscsilun->num_blocks - lba) * iscsilun->block_size; > >>>>> qemu_mutex_lock(&iscsilun->mutex); > >>>>> retry: > >>>>> @@ -764,7 +765,7 @@ retry: > >>>>> goto out_unlock; > >>>>> } > >>>>> - *pnum = (int64_t) lbasd->num_blocks * iscsilun->block_size; > >>>>> + *pnum = MIN((int64_t) lbasd->num_blocks * iscsilun->block_size, max_bytes); > >>>>> if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED || > >>>>> lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) { > >>>> > >>>> What about this? 
> >>>> > >>>> -- >8 -- > >>>> diff --git a/block/iscsi.c b/block/iscsi.c > >>>> index 2aea7e3f13..25598accbb 100644 > >>>> --- a/block/iscsi.c > >>>> +++ b/block/iscsi.c > >>>> @@ -506,6 +506,11 @@ iscsi_allocmap_update(IscsiLun *iscsilun, int64_t offset, > >>>> /* shrink to touch only completely contained clusters */ > >>>> cl_num_shrunk = DIV_ROUND_UP(offset, iscsilun->cluster_size); > >>>> nb_cls_shrunk = (offset + bytes) / iscsilun->cluster_size - cl_num_shrunk; > >>>> + if (nb_cls_expanded >= iscsilun->allocmap_size > >>>> + || nb_cls_shrunk >= iscsilun->allocmap_size) { > >>>> + error_report("iSCSI invalid request: ..." /* TODO */); > >>>> + return; > >>>> + } > >>>> if (allocated) { > >>>> bitmap_set(iscsilun->allocmap, cl_num_expanded, nb_cls_expanded); > >>>> } else { > >>>> --- > >>> > >>> I'm not sure the above is correct because (if I read this right) > >>> nb_cls_* represents the number of clusters, not the last cluster. > >>> > >>> Personally, I would have the checks (or "trim"s) closer to where they > >>> were issued (to fail sooner) and assert()s closer to bitmap (as no oob > >>> accesses should be happening at this point). There were also > >>> discussions about using safer (higher level) bitmaps for this. I'm > >>> always in favour of adding all reasonable checks. :) > >> I would add assertions that cl_num + nb_cls <= allocmap_size before > >> every set and clear. This makes sense to me. Do you want to send this as a follow-up patch? I'd like to keep the CVE fix itself minimal. > > The description starts with "A malicious guest in control of the > > iSCSI server ..." so asserting (and killing the VM) doesn't seem > > correct... assert() isn't an error check, but it means that we deem it impossible for the assertion to fail. This would be the case because we fixed (in this patch) the only code path that we think could cause the problem. We would only add it to find other buggy code paths that we missed or that are added later. > Correct. That's why I would have the proper checks (or "trim"s) closer > to where they were issued to fail sooner. What I meant is that if a > guest issues any operation that spans past the end of the drive, then > the operation stops there and an error is returned accordingly. Guests can't issue operations that span past the end of the drive. They would return an error befor the iscsi driver is even called. The only reason why we get such a request here is because of an internal call with BDRV_REQUEST_MAX_BYTES. Maybe this should actually be changed into MIN(BDRV_REQUEST_MAX_BYTES, bs->total_sectors * BDRV_SECTOR_SIZE), and then iscsi_co_block_status() could assert that the request doesn't span past the end of the drive. > This means nothing should ever try to touch these bitmaps out of > bounds. Nevertheless, and further to that, assert()s can be used > closer to where the bitmap is touched to catch programming errors. > > > I suppose the iSCSI protocol has some error to return for invalid > > requests. > > Which invalid you are referring to? From the initiator or the target? > AFAICT the problem is that the SCSI SPEC doesn't limit a target to > respond provisioning status past the (current) end of the LUN (either > because this was not deemed important to stress, was forgotten, or is > intentionally allowed). In any case, we don't get an invalid request here. We are who made the request. It's an unexpected response that we got. > > Also shouldn't we report some warning in case of such invalid > > request? 
So the management side can look at the 'malicious iSCSI > > server'? > > I think having the option to do so is a good idea. There are two cases > I can think of that you run into a "malicious" storage server: > 1) Someone hacked your storage server > 2) Your control plane allows your compute to connect to a user > provided storage service > > Thinking as an admin, if I only allow storage servers I provide, then > I want to see such warnings. If I let people point the VMM to dodgy > servers, then I probably don't want the log spam. For this reason, we generally don't log things for failed I/O requests. If we wanted to introduce it, we'd better find a way to do so consistently everywhere and not just in a single place with a one-off option. Kevin
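A rough sketch of the alternative Kevin floats above: clamp the internal request at whichever caller currently passes BDRV_REQUEST_MAX_BYTES, then let the driver assert the invariant. The call site and the assert placement are assumptions, not a reviewed patch:

    /* At the internal caller that currently passes BDRV_REQUEST_MAX_BYTES: */
    int64_t total_bytes = bs->total_sectors * BDRV_SECTOR_SIZE;
    int64_t bytes       = MIN(BDRV_REQUEST_MAX_BYTES, total_bytes - offset);

    /* ... and in iscsi_co_block_status(), requests past EOF become impossible: */
    assert(offset + bytes <= iscsilun->num_blocks * iscsilun->block_size);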
On 1/24/20 2:39 PM, Kevin Wolf wrote: > Am 24.01.2020 um 11:48 hat Felipe Franciosi geschrieben: >>> On Jan 24, 2020, at 10:04 AM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >>> On 1/23/20 11:58 PM, Peter Lieven wrote: >>>>> Am 23.01.2020 um 22:29 schrieb Felipe Franciosi <felipe@nutanix.com>: >>>>>> On Jan 23, 2020, at 5:46 PM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >>>>>>> On 1/23/20 1:44 PM, Felipe Franciosi wrote: >>>>>>> When querying an iSCSI server for the provisioning status of blocks (via >>>>>>> GET LBA STATUS), Qemu only validates that the response descriptor zero's >>>>>>> LBA matches the one requested. Given the SCSI spec allows servers to >>>>>>> respond with the status of blocks beyond the end of the LUN, Qemu may >>>>>>> have its heap corrupted by clearing/setting too many bits at the end of >>>>>>> its allocmap for the LUN. >>>>>>> A malicious guest in control of the iSCSI server could carefully program >>>>>>> Qemu's heap (by selectively setting the bitmap) and then smash it. >>>>>>> This limits the number of bits that iscsi_co_block_status() will try to >>>>>>> update in the allocmap so it can't overflow the bitmap. >>>>>> >>>>>> Please add: >>>>>> >>>>>> Fixes: CVE-2020-1711 (title of CVE if possible) >>>>> >>>>> I wasn't sure we had one yet. Kevin: can you do the needful in your branch? > > I added the CVE number, though I don't have a title. Thanks. [...] >>> The description starts with "A malicious guest in control of the >>> iSCSI server ..." so asserting (and killing the VM) doesn't seem >>> correct... > > assert() isn't an error check, but it means that we deem it impossible > for the assertion to fail. This would be the case because we fixed (in > this patch) the only code path that we think could cause the problem. > > We would only add it to find other buggy code paths that we missed or > that are added later. > >> Correct. That's why I would have the proper checks (or "trim"s) closer >> to where they were issued to fail sooner. What I meant is that if a >> guest issues any operation that spans past the end of the drive, then >> the operation stops there and an error is returned accordingly. > > Guests can't issue operations that span past the end of the drive. They > would return an error befor the iscsi driver is even called. > > The only reason why we get such a request here is because of an internal > call with BDRV_REQUEST_MAX_BYTES. Maybe this should actually be changed > into MIN(BDRV_REQUEST_MAX_BYTES, bs->total_sectors * BDRV_SECTOR_SIZE), > and then iscsi_co_block_status() could assert that the request doesn't > span past the end of the drive. > >> This means nothing should ever try to touch these bitmaps out of >> bounds. Nevertheless, and further to that, assert()s can be used >> closer to where the bitmap is touched to catch programming errors. >> >>> I suppose the iSCSI protocol has some error to return for invalid >>> requests. >> >> Which invalid you are referring to? From the initiator or the target? >> AFAICT the problem is that the SCSI SPEC doesn't limit a target to >> respond provisioning status past the (current) end of the LUN (either >> because this was not deemed important to stress, was forgotten, or is >> intentionally allowed). > > In any case, we don't get an invalid request here. We are who made the > request. It's an unexpected response that we got. > >>> Also shouldn't we report some warning in case of such invalid >>> request? So the management side can look at the 'malicious iSCSI >>> server'? 
>> >> I think having the option to do so is a good idea. There are two cases >> I can think of that you run into a "malicious" storage server: >> 1) Someone hacked your storage server >> 2) Your control plane allows your compute to connect to a user >> provided storage service >> >> Thinking as an admin, if I only allow storage servers I provide, then >> I want to see such warnings. If I let people point the VMM to dodgy >> servers, then I probably don't want the log spam. > > For this reason, we generally don't log things for failed I/O requests. > If we wanted to introduce it, we'd better find a way to do so > consistently everywhere and not just in a single place with a one-off > option. I'm just suggesting to use error_report().
Am 24.01.2020 um 14:42 hat Philippe Mathieu-Daudé geschrieben: > On 1/24/20 2:39 PM, Kevin Wolf wrote: > > Am 24.01.2020 um 11:48 hat Felipe Franciosi geschrieben: > > > On Jan 24, 2020, at 10:04 AM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > > > Also shouldn't we report some warning in case of such invalid > > > > request? So the management side can look at the 'malicious iSCSI > > > > server'? > > > > > > I think having the option to do so is a good idea. There are two cases > > > I can think of that you run into a "malicious" storage server: > > > 1) Someone hacked your storage server > > > 2) Your control plane allows your compute to connect to a user > > > provided storage service > > > > > > Thinking as an admin, if I only allow storage servers I provide, then > > > I want to see such warnings. If I let people point the VMM to dodgy > > > servers, then I probably don't want the log spam. > > > > For this reason, we generally don't log things for failed I/O requests. > > If we wanted to introduce it, we'd better find a way to do so > > consistently everywhere and not just in a single place with a one-off > > option. > > I'm just suggesting to use error_report(). If you do this unconditionally with an untrusted server, you allow it to DoS you by filling up your disk with error logs. Kevin
On 1/24/20 2:52 PM, Kevin Wolf wrote: > Am 24.01.2020 um 14:42 hat Philippe Mathieu-Daudé geschrieben: >> On 1/24/20 2:39 PM, Kevin Wolf wrote: >>> Am 24.01.2020 um 11:48 hat Felipe Franciosi geschrieben: >>>> On Jan 24, 2020, at 10:04 AM, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >>>>> Also shouldn't we report some warning in case of such invalid >>>>> request? So the management side can look at the 'malicious iSCSI >>>>> server'? >>>> >>>> I think having the option to do so is a good idea. There are two cases >>>> I can think of that you run into a "malicious" storage server: >>>> 1) Someone hacked your storage server >>>> 2) Your control plane allows your compute to connect to a user >>>> provided storage service >>>> >>>> Thinking as an admin, if I only allow storage servers I provide, then >>>> I want to see such warnings. If I let people point the VMM to dodgy >>>> servers, then I probably don't want the log spam. >>> >>> For this reason, we generally don't log things for failed I/O requests. >>> If we wanted to introduce it, we'd better find a way to do so >>> consistently everywhere and not just in a single place with a one-off >>> option. >> >> I'm just suggesting to use error_report(). > > If you do this unconditionally with an untrusted server, you allow it to > DoS you by filling up your disk with error logs. Good point :) error_report_once() then!
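For illustration, the one-shot variant keeps the diagnostic while avoiding the log-flooding concern; the message text here is made up:

    /* error_report_once() reports at most once per call site, so a
     * misbehaving target cannot fill the disk with repeats. */
    error_report_once("iSCSI target returned GET LBA STATUS data past the "
                      "end of the LUN; capping to the device size");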
Hi guys, (Cc'ing Jon) On 1/23/20 5:59 PM, Kevin Wolf wrote: > Am 23.01.2020 um 13:44 hat Felipe Franciosi geschrieben: >> When querying an iSCSI server for the provisioning status of blocks (via >> GET LBA STATUS), Qemu only validates that the response descriptor zero's >> LBA matches the one requested. Given the SCSI spec allows servers to >> respond with the status of blocks beyond the end of the LUN, Qemu may >> have its heap corrupted by clearing/setting too many bits at the end of >> its allocmap for the LUN. >> >> A malicious guest in control of the iSCSI server could carefully program >> Qemu's heap (by selectively setting the bitmap) and then smash it. >> >> This limits the number of bits that iscsi_co_block_status() will try to >> update in the allocmap so it can't overflow the bitmap. >> >> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> >> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> >> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> > > Thanks, applied to the block branch. We are trying to reproduce this, do you already have some code that triggered this issue? I am new to the block API, I noticed the block/blkdebug.c file with 'blkdebug' option, is it helpful to reproduce this issue via HMP? Any suggestion what would be the easier/quicker way to test this? Looking for iotests examples I see tests/qemu-iotests/147 providing a BuiltinNBD class. Is it the recommended way to go, to mock a iSCSI server? Thanks, Phil.
Am 28.01.2020 um 13:30 hat Philippe Mathieu-Daudé geschrieben: > Hi guys, > > (Cc'ing Jon) > > On 1/23/20 5:59 PM, Kevin Wolf wrote: > > Am 23.01.2020 um 13:44 hat Felipe Franciosi geschrieben: > > > When querying an iSCSI server for the provisioning status of blocks (via > > > GET LBA STATUS), Qemu only validates that the response descriptor zero's > > > LBA matches the one requested. Given the SCSI spec allows servers to > > > respond with the status of blocks beyond the end of the LUN, Qemu may > > > have its heap corrupted by clearing/setting too many bits at the end of > > > its allocmap for the LUN. > > > > > > A malicious guest in control of the iSCSI server could carefully program > > > Qemu's heap (by selectively setting the bitmap) and then smash it. > > > > > > This limits the number of bits that iscsi_co_block_status() will try to > > > update in the allocmap so it can't overflow the bitmap. > > > > > > Signed-off-by: Felipe Franciosi <felipe@nutanix.com> > > > Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> > > > Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> > > > > Thanks, applied to the block branch. > > We are trying to reproduce this, do you already have some code that > triggered this issue? I don't, maybe Felipe has a reproducer that would crash QEMU. > I am new to the block API, I noticed the block/blkdebug.c file with > 'blkdebug' option, is it helpful to reproduce this issue via HMP? > > Any suggestion what would be the easier/quicker way to test this? On the QEMU side, you just need to connect to an iscsi backend. The malicious response must come from the server, which is not part of QEMU. So no, blkdebug won't help you. > Looking for iotests examples I see tests/qemu-iotests/147 providing a > BuiltinNBD class. Is it the recommended way to go, to mock a iSCSI server? That BuiltinNBD class doesn't implement an NBD server, but it just starts the built-in NBD server in QEMU and runs some tests against it. QEMU doesn't have a built-in iscsi server. Kevin
Hi and sorry for the delay, this got lost in my inbox. > On Jan 28, 2020, at 12:42 PM, Kevin Wolf <kwolf@redhat.com> wrote: > > Am 28.01.2020 um 13:30 hat Philippe Mathieu-Daudé geschrieben: >> Hi guys, >> >> (Cc'ing Jon) >> >> On 1/23/20 5:59 PM, Kevin Wolf wrote: >>> Am 23.01.2020 um 13:44 hat Felipe Franciosi geschrieben: >>>> When querying an iSCSI server for the provisioning status of blocks (via >>>> GET LBA STATUS), Qemu only validates that the response descriptor zero's >>>> LBA matches the one requested. Given the SCSI spec allows servers to >>>> respond with the status of blocks beyond the end of the LUN, Qemu may >>>> have its heap corrupted by clearing/setting too many bits at the end of >>>> its allocmap for the LUN. >>>> >>>> A malicious guest in control of the iSCSI server could carefully program >>>> Qemu's heap (by selectively setting the bitmap) and then smash it. >>>> >>>> This limits the number of bits that iscsi_co_block_status() will try to >>>> update in the allocmap so it can't overflow the bitmap. >>>> >>>> Signed-off-by: Felipe Franciosi <felipe@nutanix.com> >>>> Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com> >>>> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com> >>> >>> Thanks, applied to the block branch. >> >> We are trying to reproduce this, do you already have some code that >> triggered this issue? > > I don't, maybe Felipe has a reproducer that would crash QEMU. It's not hard. 1) Attach an iSCSI LUN to Qemu. Do not read data from it (so Qemu won't populate the bitmap). 2) Issue a read larger than 64 blocks towards the end of the LUN. Qemu will attempt a GET LBA STATUS to find out if the provisioning status of the blocks. 3) Get your iSCSI server to respond with more blocks than is available on the LUN for that LBA. We did that with an iSCSI server written exclusively for such tests. All the responses can be controlled or fuzzed. But it should be easy to modify any existing server (eg. iet). F. > >> I am new to the block API, I noticed the block/blkdebug.c file with >> 'blkdebug' option, is it helpful to reproduce this issue via HMP? >> >> Any suggestion what would be the easier/quicker way to test this? > > On the QEMU side, you just need to connect to an iscsi backend. The > malicious response must come from the server, which is not part of QEMU. > So no, blkdebug won't help you. > >> Looking for iotests examples I see tests/qemu-iotests/147 providing a >> BuiltinNBD class. Is it the recommended way to go, to mock a iSCSI server? > > That BuiltinNBD class doesn't implement an NBD server, but it just > starts the built-in NBD server in QEMU and runs some tests against it. > QEMU doesn't have a built-in iscsi server. > > Kevin
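For anyone writing such a test target, the hostile response can be sketched with the descriptor fields used in iscsi_co_block_status(). The initializer below is only a shape: requested_lba, remaining_blocks and extra_blocks are placeholders, and a real reproducer has to encode this into the GET LBA STATUS payload on the target side:

    struct scsi_lba_status_descriptor evil = {
        .lba          = requested_lba,                      /* matches the request, so the
                                                               existing LBA check passes  */
        .num_blocks   = remaining_blocks + extra_blocks,    /* extends past the LUN end   */
        .provisioning = SCSI_PROVISIONING_TYPE_DEALLOCATED, /* drives the bitmap clearing */
    };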
diff --git a/block/iscsi.c b/block/iscsi.c
index 2aea7e3f13..cbd57294ab 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -701,7 +701,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs,
     struct scsi_get_lba_status *lbas = NULL;
     struct scsi_lba_status_descriptor *lbasd = NULL;
     struct IscsiTask iTask;
-    uint64_t lba;
+    uint64_t lba, max_bytes;
     int ret;
 
     iscsi_co_init_iscsitask(iscsilun, &iTask);
@@ -721,6 +721,7 @@ static int coroutine_fn iscsi_co_block_status(BlockDriverState *bs,
     }
 
     lba = offset / iscsilun->block_size;
+    max_bytes = (iscsilun->num_blocks - lba) * iscsilun->block_size;
 
     qemu_mutex_lock(&iscsilun->mutex);
 retry:
@@ -764,7 +765,7 @@ retry:
         goto out_unlock;
     }
 
-    *pnum = (int64_t) lbasd->num_blocks * iscsilun->block_size;
+    *pnum = MIN((int64_t) lbasd->num_blocks * iscsilun->block_size, max_bytes);
 
     if (lbasd->provisioning == SCSI_PROVISIONING_TYPE_DEALLOCATED ||
         lbasd->provisioning == SCSI_PROVISIONING_TYPE_ANCHORED) {