diff mbox series

[v8,2/8] xen: mapcache: Unmap first entries in buckets

Message ID 20240529140739.1387692-3-edgar.iglesias@gmail.com (mailing list archive)
State New, archived
Headers show
Series [v8,1/8] xen: mapcache: Make MCACHE_BUCKET_SHIFT runtime configurable | expand

Commit Message

Edgar E. Iglesias May 29, 2024, 2:07 p.m. UTC
From: "Edgar E. Iglesias" <edgar.iglesias@amd.com>

When invalidating memory ranges, if we happen to hit the first
entry in a bucket we were never unmapping it. This was harmless
for foreign mappings but now that we're looking to reuse the
mapcache for transient grant mappings, we must unmap entries
when invalidated.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
---
 hw/xen/xen-mapcache.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

Comments

Anthony PERARD July 1, 2024, 12:55 p.m. UTC | #1
Hi all,

Following this commit, a test which install Debian in a guest with OVMF
as firmware started to fail. QEMU exit with an error when GRUB is
running on the freshly installed Debian (I don't know if GRUB is
starting Linux or not).
The error is:
    Bad ram offset ffffffffffffffff

Some logs:
http://logs.test-lab.xenproject.org/osstest/logs/186611/test-amd64-amd64-xl-qemuu-ovmf-amd64/info.html

Any idea? Something is trying to do something with the address "-1" when
it shouldn't?

Cheers,

Anthony

On Wed, May 29, 2024 at 04:07:33PM +0200, Edgar E. Iglesias wrote:
> From: "Edgar E. Iglesias" <edgar.iglesias@amd.com>
> 
> When invalidating memory ranges, if we happen to hit the first
> entry in a bucket we were never unmapping it. This was harmless
> for foreign mappings but now that we're looking to reuse the
> mapcache for transient grant mappings, we must unmap entries
> when invalidated.
> 
> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> ---
>  hw/xen/xen-mapcache.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> index bc860f4373..ec95445696 100644
> --- a/hw/xen/xen-mapcache.c
> +++ b/hw/xen/xen-mapcache.c
> @@ -491,18 +491,23 @@ static void xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
>          return;
>      }
>      entry->lock--;
> -    if (entry->lock > 0 || pentry == NULL) {
> +    if (entry->lock > 0) {
>          return;
>      }
>  
> -    pentry->next = entry->next;
>      ram_block_notify_remove(entry->vaddr_base, entry->size, entry->size);
>      if (munmap(entry->vaddr_base, entry->size) != 0) {
>          perror("unmap fails");
>          exit(-1);
>      }
> +
>      g_free(entry->valid_mapping);
> -    g_free(entry);
> +    if (pentry) {
> +        pentry->next = entry->next;
> +        g_free(entry);
> +    } else {
> +        memset(entry, 0, sizeof *entry);
> +    }
>  }
>  
>  typedef struct XenMapCacheData {
> -- 
> 2.40.1
> 
>
Edgar E. Iglesias July 1, 2024, 1:58 p.m. UTC | #2
On Mon, Jul 1, 2024 at 2:55 PM Anthony PERARD <anthony.perard@vates.tech>
wrote:

> Hi all,
>
> Following this commit, a test which install Debian in a guest with OVMF
> as firmware started to fail. QEMU exit with an error when GRUB is
> running on the freshly installed Debian (I don't know if GRUB is
> starting Linux or not).
> The error is:
>     Bad ram offset ffffffffffffffff
>
> Some logs:
>
> http://logs.test-lab.xenproject.org/osstest/logs/186611/test-amd64-amd64-xl-qemuu-ovmf-amd64/info.html
>
> Any idea? Something is trying to do something with the address "-1" when
> it shouldn't?
>
>
Hi Anothny,

Yes, it looks like something is calling qemu_get_ram_block() on something
that isn't mapped.
One possible path is in qemu_ram_block_from_host() but there may be others.

The following patch may help.
Any chance you could try to get a backtrace from QEMU when it failed?

diff --git a/system/physmem.c b/system/physmem.c
index 33d09f7571..2669c4dbbb 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2277,6 +2277,9 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool
round_offset,
         ram_addr_t ram_addr;
         RCU_READ_LOCK_GUARD();
         ram_addr = xen_ram_addr_from_mapcache(ptr);
+        if (ram_addr == RAM_ADDR_INVALID) {
+            return NULL;
+        }
         block = qemu_get_ram_block(ram_addr);
         if (block) {
             *offset = ram_addr - block->offset;





> Cheers,
>
> Anthony
>
> On Wed, May 29, 2024 at 04:07:33PM +0200, Edgar E. Iglesias wrote:
> > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com>
> >
> > When invalidating memory ranges, if we happen to hit the first
> > entry in a bucket we were never unmapping it. This was harmless
> > for foreign mappings but now that we're looking to reuse the
> > mapcache for transient grant mappings, we must unmap entries
> > when invalidated.
> >
> > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
> > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
> > ---
> >  hw/xen/xen-mapcache.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> > index bc860f4373..ec95445696 100644
> > --- a/hw/xen/xen-mapcache.c
> > +++ b/hw/xen/xen-mapcache.c
> > @@ -491,18 +491,23 @@ static void
> xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
> >          return;
> >      }
> >      entry->lock--;
> > -    if (entry->lock > 0 || pentry == NULL) {
> > +    if (entry->lock > 0) {
> >          return;
> >      }
> >
> > -    pentry->next = entry->next;
> >      ram_block_notify_remove(entry->vaddr_base, entry->size,
> entry->size);
> >      if (munmap(entry->vaddr_base, entry->size) != 0) {
> >          perror("unmap fails");
> >          exit(-1);
> >      }
> > +
> >      g_free(entry->valid_mapping);
> > -    g_free(entry);
> > +    if (pentry) {
> > +        pentry->next = entry->next;
> > +        g_free(entry);
> > +    } else {
> > +        memset(entry, 0, sizeof *entry);
> > +    }
> >  }
> >
> >  typedef struct XenMapCacheData {
> > --
> > 2.40.1
> >
> >
> --
>
> Anthony Perard | Vates XCP-ng Developer
>
> XCP-ng & Xen Orchestra - Vates solutions
>
> web: https://vates.tech
>
Edgar E. Iglesias July 1, 2024, 2:30 p.m. UTC | #3
On Mon, Jul 1, 2024 at 3:58 PM Edgar E. Iglesias <edgar.iglesias@gmail.com>
wrote:

> On Mon, Jul 1, 2024 at 2:55 PM Anthony PERARD <anthony.perard@vates.tech>
> wrote:
>
>> Hi all,
>>
>> Following this commit, a test which install Debian in a guest with OVMF
>> as firmware started to fail. QEMU exit with an error when GRUB is
>> running on the freshly installed Debian (I don't know if GRUB is
>> starting Linux or not).
>> The error is:
>>     Bad ram offset ffffffffffffffff
>>
>> Some logs:
>>
>> http://logs.test-lab.xenproject.org/osstest/logs/186611/test-amd64-amd64-xl-qemuu-ovmf-amd64/info.html
>>
>> Any idea? Something is trying to do something with the address "-1" when
>> it shouldn't?
>>
>>
> Hi Anothny,
>
> Yes, it looks like something is calling qemu_get_ram_block() on something
> that isn't mapped.
> One possible path is in qemu_ram_block_from_host() but there may be others.
>
> The following patch may help.
> Any chance you could try to get a backtrace from QEMU when it failed?
>
> diff --git a/system/physmem.c b/system/physmem.c
> index 33d09f7571..2669c4dbbb 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2277,6 +2277,9 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool
> round_offset,
>          ram_addr_t ram_addr;
>          RCU_READ_LOCK_GUARD();
>          ram_addr = xen_ram_addr_from_mapcache(ptr);
> +        if (ram_addr == RAM_ADDR_INVALID) {
> +            return NULL;
> +        }
>          block = qemu_get_ram_block(ram_addr);
>          if (block) {
>              *offset = ram_addr - block->offset;
>
>
>
One more thing, regarding this specific patch. I don't think we should
clear the
entire entry, the next field should be kept, otherwise we'll disconnect
following
mappings that will never be found again. IIUC, this could very well be
causing the problem you see.

Does the following make sense?

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index 5f23b0adbe..e9df53c19d 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -597,7 +597,14 @@ static void
xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
         pentry->next = entry->next;
         g_free(entry);
     } else {
-        memset(entry, 0, sizeof *entry);
+        /* Invalidate mapping.  */
+        entry->paddr_index = 0;
+        entry->vaddr_base = NULL;
+        entry->size = 0;
+        g_free(entry->valid_mapping);
+        entry->valid_mapping = NULL;
+        entry->flags = 0;
+        /* Keep entry->next pointing to the rest of the list.  */
     }
 }









>
>
>> Cheers,
>>
>> Anthony
>>
>> On Wed, May 29, 2024 at 04:07:33PM +0200, Edgar E. Iglesias wrote:
>> > From: "Edgar E. Iglesias" <edgar.iglesias@amd.com>
>> >
>> > When invalidating memory ranges, if we happen to hit the first
>> > entry in a bucket we were never unmapping it. This was harmless
>> > for foreign mappings but now that we're looking to reuse the
>> > mapcache for transient grant mappings, we must unmap entries
>> > when invalidated.
>> >
>> > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
>> > Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
>> > ---
>> >  hw/xen/xen-mapcache.c | 11 ++++++++---
>> >  1 file changed, 8 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
>> > index bc860f4373..ec95445696 100644
>> > --- a/hw/xen/xen-mapcache.c
>> > +++ b/hw/xen/xen-mapcache.c
>> > @@ -491,18 +491,23 @@ static void
>> xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
>> >          return;
>> >      }
>> >      entry->lock--;
>> > -    if (entry->lock > 0 || pentry == NULL) {
>> > +    if (entry->lock > 0) {
>> >          return;
>> >      }
>> >
>> > -    pentry->next = entry->next;
>> >      ram_block_notify_remove(entry->vaddr_base, entry->size,
>> entry->size);
>> >      if (munmap(entry->vaddr_base, entry->size) != 0) {
>> >          perror("unmap fails");
>> >          exit(-1);
>> >      }
>> > +
>> >      g_free(entry->valid_mapping);
>> > -    g_free(entry);
>> > +    if (pentry) {
>> > +        pentry->next = entry->next;
>> > +        g_free(entry);
>> > +    } else {
>> > +        memset(entry, 0, sizeof *entry);
>> > +    }
>> >  }
>> >
>> >  typedef struct XenMapCacheData {
>> > --
>> > 2.40.1
>> >
>> >
>> --
>>
>> Anthony Perard | Vates XCP-ng Developer
>>
>> XCP-ng & Xen Orchestra - Vates solutions
>>
>> web: https://vates.tech
>>
>
Edgar E. Iglesias July 1, 2024, 2:34 p.m. UTC | #4
On Mon, Jul 1, 2024 at 4:30 PM Edgar E. Iglesias <edgar.iglesias@gmail.com>
wrote:

>
>
> On Mon, Jul 1, 2024 at 3:58 PM Edgar E. Iglesias <edgar.iglesias@gmail.com>
> wrote:
>
>> On Mon, Jul 1, 2024 at 2:55 PM Anthony PERARD <anthony.perard@vates.tech>
>> wrote:
>>
>>> Hi all,
>>>
>>> Following this commit, a test which install Debian in a guest with OVMF
>>> as firmware started to fail. QEMU exit with an error when GRUB is
>>> running on the freshly installed Debian (I don't know if GRUB is
>>> starting Linux or not).
>>> The error is:
>>>     Bad ram offset ffffffffffffffff
>>>
>>> Some logs:
>>>
>>> http://logs.test-lab.xenproject.org/osstest/logs/186611/test-amd64-amd64-xl-qemuu-ovmf-amd64/info.html
>>>
>>> Any idea? Something is trying to do something with the address "-1" when
>>> it shouldn't?
>>>
>>>
>> Hi Anothny,
>>
>> Yes, it looks like something is calling qemu_get_ram_block() on something
>> that isn't mapped.
>> One possible path is in qemu_ram_block_from_host() but there may be
>> others.
>>
>> The following patch may help.
>> Any chance you could try to get a backtrace from QEMU when it failed?
>>
>> diff --git a/system/physmem.c b/system/physmem.c
>> index 33d09f7571..2669c4dbbb 100644
>> --- a/system/physmem.c
>> +++ b/system/physmem.c
>> @@ -2277,6 +2277,9 @@ RAMBlock *qemu_ram_block_from_host(void *ptr, bool
>> round_offset,
>>          ram_addr_t ram_addr;
>>          RCU_READ_LOCK_GUARD();
>>          ram_addr = xen_ram_addr_from_mapcache(ptr);
>> +        if (ram_addr == RAM_ADDR_INVALID) {
>> +            return NULL;
>> +        }
>>          block = qemu_get_ram_block(ram_addr);
>>          if (block) {
>>              *offset = ram_addr - block->offset;
>>
>>
>>
> One more thing, regarding this specific patch. I don't think we should
> clear the
> entire entry, the next field should be kept, otherwise we'll disconnect
> following
> mappings that will never be found again. IIUC, this could very well be
> causing the problem you see.
>
> Does the following make sense?
>
> diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> index 5f23b0adbe..e9df53c19d 100644
> --- a/hw/xen/xen-mapcache.c
> +++ b/hw/xen/xen-mapcache.c
> @@ -597,7 +597,14 @@ static void
> xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
>          pentry->next = entry->next;
>          g_free(entry);
>      } else {
> -        memset(entry, 0, sizeof *entry);
> +        /* Invalidate mapping.  */
> +        entry->paddr_index = 0;
> +        entry->vaddr_base = NULL;
> +        entry->size = 0;
> +        g_free(entry->valid_mapping);
> +        entry->valid_mapping = NULL;
> +        entry->flags = 0;
> +        /* Keep entry->next pointing to the rest of the list.  */
>      }
>  }
>
>
>

And here without double-freeing entry->valid_mapping:

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index 5f23b0adbe..667807b3b6 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -597,7 +597,13 @@ static void
xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
         pentry->next = entry->next;
         g_free(entry);
     } else {
-        memset(entry, 0, sizeof *entry);
+        /* Invalidate mapping.  */
+        entry->paddr_index = 0;
+        entry->vaddr_base = NULL;
+        entry->size = 0;
+        entry->valid_mapping = NULL;
+        entry->flags = 0;
+        /* Keep entry->next pointing to the rest of the list.  */
     }
 }
Anthony PERARD July 1, 2024, 4:21 p.m. UTC | #5
On Mon, Jul 01, 2024 at 04:34:53PM +0200, Edgar E. Iglesias wrote:
> On Mon, Jul 1, 2024 at 4:30 PM Edgar E. Iglesias <edgar.iglesias@gmail.com>
> wrote:
> > On Mon, Jul 1, 2024 at 3:58 PM Edgar E. Iglesias <edgar.iglesias@gmail.com>
> > wrote:
> >> Any chance you could try to get a backtrace from QEMU when it failed?

Here it is:


#3  0x00007fa8762f4472 in __GI_abort () at ./stdlib/abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x20, sa_sigaction = 0x20}, sa_mask = {__val = {94603440166168, 18446744073709551615, 94603406369640, 0, 0, 94603406346720, 94603440166168, 140361486774256, 0, 140361486773376, 94603401285536, 140361496232688, 94603440166096, 140361486773456, 94603401289576, 140360849280256}}, sa_flags = -1804462896, sa_restorer = 0x748f2d40}
#4  0x0000560a92230f0d in qemu_get_ram_block (addr=18446744073709551615) at ../system/physmem.c:801
        block = 0x0
#5  0x0000560a922350ab in qemu_ram_block_from_host (ptr=0x7fa84e8fcd00, round_offset=false, offset=0x7fa8748f2de8) at ../system/physmem.c:2280
        ram_addr = 18446744073709551615
        _rcu_read_auto = 0x1
        block = 0x0
        host = 0x7fa84e8fcd00 ""
        _rcu_read_auto = 0x7fa8751f8288
#6  0x0000560a92229669 in memory_region_from_host (ptr=0x7fa84e8fcd00, offset=0x7fa8748f2de8) at ../system/memory.c:2440
        block = 0x0
#7  0x0000560a92237418 in address_space_unmap (as=0x560a94b05a20, buffer=0x7fa84e8fcd00, len=32768, is_write=true, access_len=32768) at ../system/physmem.c:3246
        mr = 0x0
        addr1 = 0
        __PRETTY_FUNCTION__ = "address_space_unmap"
#8  0x0000560a91fd6cd3 in dma_memory_unmap (as=0x560a94b05a20, buffer=0x7fa84e8fcd00, len=32768, dir=DMA_DIRECTION_FROM_DEVICE, access_len=32768) at /root/build/qemu/include/sysemu/dma.h:236
#9  0x0000560a91fd763d in dma_blk_unmap (dbs=0x560a94d87400) at ../system/dma-helpers.c:93
        i = 1
#10 0x0000560a91fd76e6 in dma_complete (dbs=0x560a94d87400, ret=0) at ../system/dma-helpers.c:105
        __PRETTY_FUNCTION__ = "dma_complete"
#11 0x0000560a91fd781c in dma_blk_cb (opaque=0x560a94d87400, ret=0) at ../system/dma-helpers.c:129
        dbs = 0x560a94d87400
        ctx = 0x560a9448da90
        cur_addr = 0
        cur_len = 0
        mem = 0x0
        __PRETTY_FUNCTION__ = "dma_blk_cb"
#12 0x0000560a9232e974 in blk_aio_complete (acb=0x560a9448d5f0) at ../block/block-backend.c:1555
#13 0x0000560a9232ebd1 in blk_aio_read_entry (opaque=0x560a9448d5f0) at ../block/block-backend.c:1610
        acb = 0x560a9448d5f0
        rwco = 0x560a9448d618
        qiov = 0x560a94d87460
        __PRETTY_FUNCTION__ = "blk_aio_read_entry"

> > One more thing, regarding this specific patch. I don't think we should
> > clear the
> > entire entry, the next field should be kept, otherwise we'll disconnect
> > following
> > mappings that will never be found again. IIUC, this could very well be
> > causing the problem you see.
> >
> > Does the following make sense?
> >
> And here without double-freeing entry->valid_mapping:
> 
> diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> index 5f23b0adbe..667807b3b6 100644
> --- a/hw/xen/xen-mapcache.c
> +++ b/hw/xen/xen-mapcache.c
> @@ -597,7 +597,13 @@ static void
> xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
>          pentry->next = entry->next;
>          g_free(entry);
>      } else {
> -        memset(entry, 0, sizeof *entry);
> +        /* Invalidate mapping.  */
> +        entry->paddr_index = 0;
> +        entry->vaddr_base = NULL;
> +        entry->size = 0;
> +        entry->valid_mapping = NULL;
> +        entry->flags = 0;
> +        /* Keep entry->next pointing to the rest of the list.  */
>      }
>  }

I've tried this patch, and that fix the issue I've seen. I'll run more
tests on it, just in case, but there's no reason that would break
something else.

Cheers,
Edgar E. Iglesias July 1, 2024, 9:28 p.m. UTC | #6
On Mon, Jul 1, 2024 at 6:21 PM Anthony PERARD <anthony.perard@vates.tech>
wrote:

> On Mon, Jul 01, 2024 at 04:34:53PM +0200, Edgar E. Iglesias wrote:
> > On Mon, Jul 1, 2024 at 4:30 PM Edgar E. Iglesias <
> edgar.iglesias@gmail.com>
> > wrote:
> > > On Mon, Jul 1, 2024 at 3:58 PM Edgar E. Iglesias <
> edgar.iglesias@gmail.com>
> > > wrote:
> > >> Any chance you could try to get a backtrace from QEMU when it failed?
>
> Here it is:
>
>
> #3  0x00007fa8762f4472 in __GI_abort () at ./stdlib/abort.c:79
>         save_stage = 1
>         act = {__sigaction_handler = {sa_handler = 0x20, sa_sigaction =
> 0x20}, sa_mask = {__val = {94603440166168, 18446744073709551615,
> 94603406369640, 0, 0, 94603406346720, 94603440166168, 140361486774256, 0,
> 140361486773376, 94603401285536, 140361496232688, 94603440166096,
> 140361486773456, 94603401289576, 140360849280256}}, sa_flags = -1804462896,
> sa_restorer = 0x748f2d40}
> #4  0x0000560a92230f0d in qemu_get_ram_block (addr=18446744073709551615)
> at ../system/physmem.c:801
>         block = 0x0
> #5  0x0000560a922350ab in qemu_ram_block_from_host (ptr=0x7fa84e8fcd00,
> round_offset=false, offset=0x7fa8748f2de8) at ../system/physmem.c:2280
>         ram_addr = 18446744073709551615
>         _rcu_read_auto = 0x1
>         block = 0x0
>         host = 0x7fa84e8fcd00 ""
>         _rcu_read_auto = 0x7fa8751f8288
> #6  0x0000560a92229669 in memory_region_from_host (ptr=0x7fa84e8fcd00,
> offset=0x7fa8748f2de8) at ../system/memory.c:2440
>         block = 0x0
> #7  0x0000560a92237418 in address_space_unmap (as=0x560a94b05a20,
> buffer=0x7fa84e8fcd00, len=32768, is_write=true, access_len=32768) at
> ../system/physmem.c:3246
>         mr = 0x0
>         addr1 = 0
>         __PRETTY_FUNCTION__ = "address_space_unmap"
> #8  0x0000560a91fd6cd3 in dma_memory_unmap (as=0x560a94b05a20,
> buffer=0x7fa84e8fcd00, len=32768, dir=DMA_DIRECTION_FROM_DEVICE,
> access_len=32768) at /root/build/qemu/include/sysemu/dma.h:236
> #9  0x0000560a91fd763d in dma_blk_unmap (dbs=0x560a94d87400) at
> ../system/dma-helpers.c:93
>         i = 1
> #10 0x0000560a91fd76e6 in dma_complete (dbs=0x560a94d87400, ret=0) at
> ../system/dma-helpers.c:105
>         __PRETTY_FUNCTION__ = "dma_complete"
> #11 0x0000560a91fd781c in dma_blk_cb (opaque=0x560a94d87400, ret=0) at
> ../system/dma-helpers.c:129
>         dbs = 0x560a94d87400
>         ctx = 0x560a9448da90
>         cur_addr = 0
>         cur_len = 0
>         mem = 0x0
>         __PRETTY_FUNCTION__ = "dma_blk_cb"
> #12 0x0000560a9232e974 in blk_aio_complete (acb=0x560a9448d5f0) at
> ../block/block-backend.c:1555
> #13 0x0000560a9232ebd1 in blk_aio_read_entry (opaque=0x560a9448d5f0) at
> ../block/block-backend.c:1610
>         acb = 0x560a9448d5f0
>         rwco = 0x560a9448d618
>         qiov = 0x560a94d87460
>         __PRETTY_FUNCTION__ = "blk_aio_read_entry"
>
> > > One more thing, regarding this specific patch. I don't think we should
> > > clear the
> > > entire entry, the next field should be kept, otherwise we'll disconnect
> > > following
> > > mappings that will never be found again. IIUC, this could very well be
> > > causing the problem you see.
> > >
> > > Does the following make sense?
> > >
> > And here without double-freeing entry->valid_mapping:
> >
> > diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
> > index 5f23b0adbe..667807b3b6 100644
> > --- a/hw/xen/xen-mapcache.c
> > +++ b/hw/xen/xen-mapcache.c
> > @@ -597,7 +597,13 @@ static void
> > xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
> >          pentry->next = entry->next;
> >          g_free(entry);
> >      } else {
> > -        memset(entry, 0, sizeof *entry);
> > +        /* Invalidate mapping.  */
> > +        entry->paddr_index = 0;
> > +        entry->vaddr_base = NULL;
> > +        entry->size = 0;
> > +        entry->valid_mapping = NULL;
> > +        entry->flags = 0;
> > +        /* Keep entry->next pointing to the rest of the list.  */
> >      }
> >  }
>
> I've tried this patch, and that fix the issue I've seen. I'll run more
> tests on it, just in case, but there's no reason that would break
> something else.
>
> Cheers,
>

Thanks Anthony, I'll send out a proper patch tomorrow.

Cheers,
Edgar
diff mbox series

Patch

diff --git a/hw/xen/xen-mapcache.c b/hw/xen/xen-mapcache.c
index bc860f4373..ec95445696 100644
--- a/hw/xen/xen-mapcache.c
+++ b/hw/xen/xen-mapcache.c
@@ -491,18 +491,23 @@  static void xen_invalidate_map_cache_entry_unlocked(MapCache *mc,
         return;
     }
     entry->lock--;
-    if (entry->lock > 0 || pentry == NULL) {
+    if (entry->lock > 0) {
         return;
     }
 
-    pentry->next = entry->next;
     ram_block_notify_remove(entry->vaddr_base, entry->size, entry->size);
     if (munmap(entry->vaddr_base, entry->size) != 0) {
         perror("unmap fails");
         exit(-1);
     }
+
     g_free(entry->valid_mapping);
-    g_free(entry);
+    if (pentry) {
+        pentry->next = entry->next;
+        g_free(entry);
+    } else {
+        memset(entry, 0, sizeof *entry);
+    }
 }
 
 typedef struct XenMapCacheData {