diff mbox series

[v10,10/10] vfio: Don't issue full 2^64 unmap

Message ID 20201008171558.410886-11-jean-philippe@linaro.org (mailing list archive)
State New, archived
Headers show
Series virtio-iommu: VFIO integration | expand

Commit Message

Jean-Philippe Brucker Oct. 8, 2020, 5:15 p.m. UTC
IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When
attempting to deal with such region, vfio_listener_region_del() passes a
size of 2^64 to int128_get64() which throws an assertion failure.  Even
ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size
since the size field is 64-bit. Split the request in two.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---
For me this happens when memory_region_iommu_set_page_size_mask()
returns an error because a hotplugged endpoint uses an incompatible page
mask. vfio_connect_container() releases the memory listener which calls
region_del() with the 2^64 IOMMU region. There are probably other ways
to reach this.
---
 hw/vfio/common.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Alex Williamson Oct. 8, 2020, 9:22 p.m. UTC | #1
On Thu,  8 Oct 2020 19:15:58 +0200
Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:

> IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When
> attempting to deal with such region, vfio_listener_region_del() passes a
> size of 2^64 to int128_get64() which throws an assertion failure.  Even
> ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size
> since the size field is 64-bit. Split the request in two.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
> For me this happens when memory_region_iommu_set_page_size_mask()
> returns an error because a hotplugged endpoint uses an incompatible page
> mask. vfio_connect_container() releases the memory listener which calls
> region_del() with the 2^64 IOMMU region. There are probably other ways
> to reach this.
> ---
>  hw/vfio/common.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index e66054b02a7..e90a89c389e 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -797,6 +797,17 @@ static void vfio_listener_region_del(MemoryListener *listener,
>      }
>  
>      if (try_unmap) {
> +        if (llsize == int128_2_64()) {
> +            /* The unmap ioctl doesn't accept a full 64-bit span. */
> +            llsize = int128_rshift(llsize, 1);
> +            ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> +            if (ret) {
> +                error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> +                             "0x%"HWADDR_PRIx") = %d (%m)",
> +                             container, iova, int128_get64(llsize), ret);
> +            }
> +            iova += int128_get64(llsize);
> +        }
>          ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
>          if (ret) {
>              error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "

We're still susceptible that splitting the range in two could result in
unmap calls that attempt to bisect a mapping that spans both ranges.
Both unmap calls would fail in that case.  I think we could solve this
more completely with a high water marker, but this probably good enough
for now.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Eric Auger Oct. 16, 2020, 9:47 a.m. UTC | #2
Hi Jean,

On 10/8/20 7:15 PM, Jean-Philippe Brucker wrote:
> IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When
> attempting to deal with such region, vfio_listener_region_del() passes a
> size of 2^64 to int128_get64() which throws an assertion failure.  Even
> ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size
> since the size field is 64-bit. Split the request in two.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> ---
> For me this happens when memory_region_iommu_set_page_size_mask()
> returns an error because a hotplugged endpoint uses an incompatible page
> mask. vfio_connect_container() releases the memory listener which calls
> region_del() with the 2^64 IOMMU region. There are probably other ways
> to reach this.
> ---
>  hw/vfio/common.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index e66054b02a7..e90a89c389e 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -797,6 +797,17 @@ static void vfio_listener_region_del(MemoryListener *listener,
>      }
>  
>      if (try_unmap) {
> +        if (llsize == int128_2_64()) {
> +            /* The unmap ioctl doesn't accept a full 64-bit span. */
> +            llsize = int128_rshift(llsize, 1);
> +            ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> +            if (ret) {
> +                error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> +                             "0x%"HWADDR_PRIx") = %d (%m)",
> +                             container, iova, int128_get64(llsize), ret);
> +            }
> +            iova += int128_get64(llsize);
> +        }
>          ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
>          if (ret) {
>              error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>
Michael S. Tsirkin Oct. 30, 2020, 10:25 a.m. UTC | #3
On Thu, Oct 08, 2020 at 03:22:14PM -0600, Alex Williamson wrote:
> On Thu,  8 Oct 2020 19:15:58 +0200
> Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
> 
> > IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When
> > attempting to deal with such region, vfio_listener_region_del() passes a
> > size of 2^64 to int128_get64() which throws an assertion failure.  Even
> > ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size
> > since the size field is 64-bit. Split the request in two.
> > 
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > ---
> > For me this happens when memory_region_iommu_set_page_size_mask()
> > returns an error because a hotplugged endpoint uses an incompatible page
> > mask. vfio_connect_container() releases the memory listener which calls
> > region_del() with the 2^64 IOMMU region. There are probably other ways
> > to reach this.
> > ---
> >  hw/vfio/common.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index e66054b02a7..e90a89c389e 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -797,6 +797,17 @@ static void vfio_listener_region_del(MemoryListener *listener,
> >      }
> >  
> >      if (try_unmap) {
> > +        if (llsize == int128_2_64()) {
> > +            /* The unmap ioctl doesn't accept a full 64-bit span. */
> > +            llsize = int128_rshift(llsize, 1);
> > +            ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> > +            if (ret) {
> > +                error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> > +                             "0x%"HWADDR_PRIx") = %d (%m)",
> > +                             container, iova, int128_get64(llsize), ret);
> > +            }
> > +            iova += int128_get64(llsize);
> > +        }
> >          ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> >          if (ret) {
> >              error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> 
> We're still susceptible that splitting the range in two could result in
> unmap calls that attempt to bisect a mapping that spans both ranges.
> Both unmap calls would fail in that case.  I think we could solve this
> more completely with a high water marker, but this probably good enough
> for now.
> 
> Acked-by: Alex Williamson <alex.williamson@redhat.com>


Are you merging this then?
If yes

Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alex Williamson Oct. 30, 2020, 5:26 p.m. UTC | #4
On Fri, 30 Oct 2020 06:25:34 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Thu, Oct 08, 2020 at 03:22:14PM -0600, Alex Williamson wrote:
> > On Thu,  8 Oct 2020 19:15:58 +0200
> > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote:
> >   
> > > IOMMUs may declare memory regions spanning from 0 to UINT64_MAX. When
> > > attempting to deal with such region, vfio_listener_region_del() passes a
> > > size of 2^64 to int128_get64() which throws an assertion failure.  Even
> > > ignoring this, the VFIO_IOMMU_DMA_MAP ioctl cannot handle this size
> > > since the size field is 64-bit. Split the request in two.
> > > 
> > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > > ---
> > > For me this happens when memory_region_iommu_set_page_size_mask()
> > > returns an error because a hotplugged endpoint uses an incompatible page
> > > mask. vfio_connect_container() releases the memory listener which calls
> > > region_del() with the 2^64 IOMMU region. There are probably other ways
> > > to reach this.
> > > ---
> > >  hw/vfio/common.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > > 
> > > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > > index e66054b02a7..e90a89c389e 100644
> > > --- a/hw/vfio/common.c
> > > +++ b/hw/vfio/common.c
> > > @@ -797,6 +797,17 @@ static void vfio_listener_region_del(MemoryListener *listener,
> > >      }
> > >  
> > >      if (try_unmap) {
> > > +        if (llsize == int128_2_64()) {
> > > +            /* The unmap ioctl doesn't accept a full 64-bit span. */
> > > +            llsize = int128_rshift(llsize, 1);
> > > +            ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> > > +            if (ret) {
> > > +                error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> > > +                             "0x%"HWADDR_PRIx") = %d (%m)",
> > > +                             container, iova, int128_get64(llsize), ret);
> > > +            }
> > > +            iova += int128_get64(llsize);
> > > +        }
> > >          ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> > >          if (ret) {
> > >              error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "  
> > 
> > We're still susceptible that splitting the range in two could result in
> > unmap calls that attempt to bisect a mapping that spans both ranges.
> > Both unmap calls would fail in that case.  I think we could solve this
> > more completely with a high water marker, but this probably good enough
> > for now.
> > 
> > Acked-by: Alex Williamson <alex.williamson@redhat.com>  
> 
> 
> Are you merging this then?
> If yes
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> 

No, the series is focused on virtio-iommu therefore I assumed you or
Eric would merge it, thus I provided an Ack.  Thanks,

Alex
Paolo Bonzini Oct. 30, 2020, 6:19 p.m. UTC | #5
On 30/10/20 18:26, Alex Williamson wrote:
>>  
>>      if (try_unmap) {
>> +        if (llsize == int128_2_64()) {
>> +            /* The unmap ioctl doesn't accept a full 64-bit span. */
>> +            llsize = int128_rshift(llsize, 1);
>> +            ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
>> +            if (ret) {
>> +                error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>> +                             "0x%"HWADDR_PRIx") = %d (%m)",
>> +                             container, iova, int128_get64(llsize), ret);
>> +            }
>> +            iova += int128_get64(llsize);
>> +        }
>>          ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
>>          if (ret) {
>>              error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "  
> We're still susceptible that splitting the range in two could result in
> unmap calls that attempt to bisect a mapping that spans both ranges.
> Both unmap calls would fail in that case.  I think we could solve this
> more completely with a high water marker, but this probably good enough
> for now.
> 
> Acked-by: Alex Williamson <alex.williamson@redhat.com>  

Could it also be fixed by passing an Int128 to vfio_dma_unmap?

Paolo
Alex Williamson Nov. 2, 2020, 5:37 p.m. UTC | #6
On Fri, 30 Oct 2020 19:19:14 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 30/10/20 18:26, Alex Williamson wrote:
> >>  
> >>      if (try_unmap) {
> >> +        if (llsize == int128_2_64()) {
> >> +            /* The unmap ioctl doesn't accept a full 64-bit span. */
> >> +            llsize = int128_rshift(llsize, 1);
> >> +            ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> >> +            if (ret) {
> >> +                error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
> >> +                             "0x%"HWADDR_PRIx") = %d (%m)",
> >> +                             container, iova, int128_get64(llsize), ret);
> >> +            }
> >> +            iova += int128_get64(llsize);
> >> +        }
> >>          ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> >>          if (ret) {
> >>              error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "    
> > We're still susceptible that splitting the range in two could result in
> > unmap calls that attempt to bisect a mapping that spans both ranges.
> > Both unmap calls would fail in that case.  I think we could solve this
> > more completely with a high water marker, but this probably good enough
> > for now.
> > 
> > Acked-by: Alex Williamson <alex.williamson@redhat.com>    
> 
> Could it also be fixed by passing an Int128 to vfio_dma_unmap?

I think we still have the issue at the vfio ioctl which takes __u64 iova
and size parameters, in bytes.  Therefore we cannot unmap an entire
64-bit address space with a single ioctl call.  We'd need to make use
of a flag to modify the ioctl behavior to work terms of some page size
to achieve this, for example if iova and size were in terms of 4K
pages, we wouldn't have this issue.  Thanks,

Alex
Paolo Bonzini Nov. 2, 2020, 5:44 p.m. UTC | #7
On 02/11/20 18:37, Alex Williamson wrote:
> I think we still have the issue at the vfio ioctl which takes __u64 iova
> and size parameters, in bytes.  Therefore we cannot unmap an entire
> 64-bit address space with a single ioctl call.  We'd need to make use
> of a flag to modify the ioctl behavior to work terms of some page size
> to achieve this, for example if iova and size were in terms of 4K
> pages, we wouldn't have this issue.  Thanks,

What happens to the last page if size is unaligned (e.g. iova==0, size==
2^64-1)?

Paolo
Alex Williamson Nov. 2, 2020, 6 p.m. UTC | #8
On Mon, 2 Nov 2020 18:44:11 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 02/11/20 18:37, Alex Williamson wrote:
> > I think we still have the issue at the vfio ioctl which takes __u64 iova
> > and size parameters, in bytes.  Therefore we cannot unmap an entire
> > 64-bit address space with a single ioctl call.  We'd need to make use
> > of a flag to modify the ioctl behavior to work terms of some page size
> > to achieve this, for example if iova and size were in terms of 4K
> > pages, we wouldn't have this issue.  Thanks,  
> 
> What happens to the last page if size is unaligned (e.g. iova==0, size==
> 2^64-1)?

Both args are currently required to be aligned to the minimum IOMMU
page size.  Thanks,

Alex
diff mbox series

Patch

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index e66054b02a7..e90a89c389e 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -797,6 +797,17 @@  static void vfio_listener_region_del(MemoryListener *listener,
     }
 
     if (try_unmap) {
+        if (llsize == int128_2_64()) {
+            /* The unmap ioctl doesn't accept a full 64-bit span. */
+            llsize = int128_rshift(llsize, 1);
+            ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
+            if (ret) {
+                error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
+                             "0x%"HWADDR_PRIx") = %d (%m)",
+                             container, iova, int128_get64(llsize), ret);
+            }
+            iova += int128_get64(llsize);
+        }
         ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
         if (ret) {
             error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "