diff mbox

[v2,1/1] qemu/migration: fix the double free problem on from_src_file

Message ID 20170606052438.35405-2-haoqf@linux.vnet.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Hao QingFeng June 6, 2017, 5:24 a.m. UTC
In load_snapshot, mis->from_src_file is freed twice, the first free is by
qemu_fclose, the second is by migration_incoming_state_destroy and
it causes Illegal instruction exception. The fix is just to remove the
first free.

This problem is found by qemu-iotests case 068 since commit
"660819b migration: shut src return path unconditionally". The error is:
068 1s ... - output mismatch (see 068.out.bad)
    --- tests/qemu-iotests/068.out	2017-05-06 01:00:26.417270437 +0200
    +++ 068.out.bad	2017-06-03 13:59:55.360274640 +0200
    @@ -6,6 +6,8 @@
     QEMU X.Y.Z monitor - type 'help' for more information
     (qemu) savevm 0
     (qemu) quit
    +./common.config: line 107: 242472 Illegal instruction     (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
    +    echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
    +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
     QEMU X.Y.Z monitor - type 'help' for more information
    -(qemu) quit
    -*** done
    +(qemu) *** done

Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
 migration/savevm.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Kevin Wolf June 6, 2017, 12:49 p.m. UTC | #1
Am 06.06.2017 um 07:24 hat QingFeng Hao geschrieben:
> In load_snapshot, mis->from_src_file is freed twice, the first free is by
> qemu_fclose, the second is by migration_incoming_state_destroy and
> it causes Illegal instruction exception. The fix is just to remove the
> first free.
> 
> This problem is found by qemu-iotests case 068 since commit
> "660819b migration: shut src return path unconditionally". The error is:
> 068 1s ... - output mismatch (see 068.out.bad)
>     --- tests/qemu-iotests/068.out	2017-05-06 01:00:26.417270437 +0200
>     +++ 068.out.bad	2017-06-03 13:59:55.360274640 +0200
>     @@ -6,6 +6,8 @@
>      QEMU X.Y.Z monitor - type 'help' for more information
>      (qemu) savevm 0
>      (qemu) quit
>     +./common.config: line 107: 242472 Illegal instruction     (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
>     +    echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
>     +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
>      QEMU X.Y.Z monitor - type 'help' for more information
>     -(qemu) quit
>     -*** done
>     +(qemu) *** done
> 
> Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Peter Xu <peterx@redhat.com>

Dave, as you only gave R-b rather than merging the patch, should this be
merged through the block tree?

> diff --git a/migration/savevm.c b/migration/savevm.c
> index 9c320f59d0..853e14e34e 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2290,7 +2290,6 @@ int load_snapshot(const char *name, Error **errp)
>  
>      aio_context_acquire(aio_context);
>      ret = qemu_loadvm_state(f);
> -    qemu_fclose(f);
>      aio_context_release(aio_context);
>  
>      migration_incoming_state_destroy();

Did we check other callers of migration_incoming_state_destroy()?

For example, qmp_xen_load_devices_state() looks suspicious, too.

I can't tell for postcopy_ram_listen_thread() - commit 660819b didn't
seem to remove a qemu_fclose() call there, but I can't see one left
behind either. Was the file leaked before commit 660819b or am I
missing something?

Kevin
Dr. David Alan Gilbert June 6, 2017, 5:42 p.m. UTC | #2
* Kevin Wolf (kwolf@redhat.com) wrote:
> Am 06.06.2017 um 07:24 hat QingFeng Hao geschrieben:
> > In load_snapshot, mis->from_src_file is freed twice, the first free is by
> > qemu_fclose, the second is by migration_incoming_state_destroy and
> > it causes Illegal instruction exception. The fix is just to remove the
> > first free.
> > 
> > This problem is found by qemu-iotests case 068 since commit
> > "660819b migration: shut src return path unconditionally". The error is:
> > 068 1s ... - output mismatch (see 068.out.bad)
> >     --- tests/qemu-iotests/068.out	2017-05-06 01:00:26.417270437 +0200
> >     +++ 068.out.bad	2017-06-03 13:59:55.360274640 +0200
> >     @@ -6,6 +6,8 @@
> >      QEMU X.Y.Z monitor - type 'help' for more information
> >      (qemu) savevm 0
> >      (qemu) quit
> >     +./common.config: line 107: 242472 Illegal instruction     (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
> >     +    echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
> >     +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
> >      QEMU X.Y.Z monitor - type 'help' for more information
> >     -(qemu) quit
> >     -*** done
> >     +(qemu) *** done
> > 
> > Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Reviewed-by: Peter Xu <peterx@redhat.com>
> 
> Dave, as you only gave R-b rather than merging the patch, should this be
> merged through the block tree?

I'm happy for it to go via block but also happy for it to go via
migration; Juan is mostly doing the migration set at the moment since
they're dominated by his cleanups.

> > diff --git a/migration/savevm.c b/migration/savevm.c
> > index 9c320f59d0..853e14e34e 100644
> > --- a/migration/savevm.c
> > +++ b/migration/savevm.c
> > @@ -2290,7 +2290,6 @@ int load_snapshot(const char *name, Error **errp)
> >  
> >      aio_context_acquire(aio_context);
> >      ret = qemu_loadvm_state(f);
> > -    qemu_fclose(f);
> >      aio_context_release(aio_context);
> >  
> >      migration_incoming_state_destroy();
> 
> Did we check other callers of migration_incoming_state_destroy()?
> 
> For example, qmp_xen_load_devices_state() looks suspicious, too.

Hmm, it looks suspicious in the opposite direction; it never sets
mis->from_src_file as was added by b4b076da into the load_snapshot path.

> I can't tell for postcopy_ram_listen_thread() - commit 660819b didn't
> seem to remove a qemu_fclose() call there, but I can't see one left
> behind either. Was the file leaked before commit 660819b or am I
> missing something?

I don't think there's a problem in the postcopy path, although hmm was
I missing a close before?

Dave
> 
> Kevin
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Juan Quintela June 6, 2017, 5:57 p.m. UTC | #3
Kevin Wolf <kwolf@redhat.com> wrote:
> Am 06.06.2017 um 07:24 hat QingFeng Hao geschrieben:
>> In load_snapshot, mis->from_src_file is freed twice, the first free is by
>> qemu_fclose, the second is by migration_incoming_state_destroy and
>> it causes Illegal instruction exception. The fix is just to remove the
>> first free.
>> 
>> This problem is found by qemu-iotests case 068 since commit
>> "660819b migration: shut src return path unconditionally". The error is:
>> 068 1s ... - output mismatch (see 068.out.bad)
>>     --- tests/qemu-iotests/068.out	2017-05-06 01:00:26.417270437 +0200
>>     +++ 068.out.bad	2017-06-03 13:59:55.360274640 +0200
>>     @@ -6,6 +6,8 @@
>>      QEMU X.Y.Z monitor - type 'help' for more information
>>      (qemu) savevm 0
>>      (qemu) quit
>>     +./common.config: line 107: 242472 Illegal instruction     (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
>>     +    echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
>>     +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
>>      QEMU X.Y.Z monitor - type 'help' for more information
>>     -(qemu) quit
>>     -*** done
>>     +(qemu) *** done
>> 
>> Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> Dave, as you only gave R-b rather than merging the patch, should this be
> merged through the block tree?

I got it.  I will send a pull request later Today.


>
> Did we check other callers of migration_incoming_state_destroy()?
>
> For example, qmp_xen_load_devices_state() looks suspicious, too.
>
> I can't tell for postcopy_ram_listen_thread() - commit 660819b didn't
> seem to remove a qemu_fclose() call there, but I can't see one left
> behind either. Was the file leaked before commit 660819b or am I
> missing something?

I will take a look at those.

Thanks, Juan.
Hao QingFeng June 7, 2017, 3:18 a.m. UTC | #4
在 2017/6/6 20:49, Kevin Wolf 写道:
> Am 06.06.2017 um 07:24 hat QingFeng Hao geschrieben:
>> In load_snapshot, mis->from_src_file is freed twice, the first free is by
>> qemu_fclose, the second is by migration_incoming_state_destroy and
>> it causes Illegal instruction exception. The fix is just to remove the
>> first free.
>>
>> This problem is found by qemu-iotests case 068 since commit
>> "660819b migration: shut src return path unconditionally". The error is:
>> 068 1s ... - output mismatch (see 068.out.bad)
>>      --- tests/qemu-iotests/068.out	2017-05-06 01:00:26.417270437 +0200
>>      +++ 068.out.bad	2017-06-03 13:59:55.360274640 +0200
>>      @@ -6,6 +6,8 @@
>>       QEMU X.Y.Z monitor - type 'help' for more information
>>       (qemu) savevm 0
>>       (qemu) quit
>>      +./common.config: line 107: 242472 Illegal instruction     (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
>>      +    echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
>>      +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
>>       QEMU X.Y.Z monitor - type 'help' for more information
>>      -(qemu) quit
>>      -*** done
>>      +(qemu) *** done
>>
>> Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Peter Xu <peterx@redhat.com>
> Dave, as you only gave R-b rather than merging the patch, should this be
> merged through the block tree?
>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 9c320f59d0..853e14e34e 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -2290,7 +2290,6 @@ int load_snapshot(const char *name, Error **errp)
>>   
>>       aio_context_acquire(aio_context);
>>       ret = qemu_loadvm_state(f);
>> -    qemu_fclose(f);
>>       aio_context_release(aio_context);
>>   
>>       migration_incoming_state_destroy();
> Did we check other callers of migration_incoming_state_destroy()?
>
> For example, qmp_xen_load_devices_state() looks suspicious, too.
Good reminder! Yes, I checked it and there is no assignment of 
from_src_file there and f
is opened locally, so I think that qemu_fclose doesn't impact 
migration_incoming_state_destroy.
migration_incoming_state_destroy is called in 4 places:
process_incoming_migration_bh, postcopy_ram_listen_thread, 
qmp_xen_load_devices_state
and load_snapshot. process_incoming_migration_bh is launched by 
process_incoming_migration_co
whose qemu_fclose is removed by commit 660819b.
For postcopy_ram_listen_thread, I didn't see where it calls qemu_fclose.
Actually to simplify the check for the problem, I just searched where 
from_src_file
is assigned to and got 2 places: process_incoming_migration_co and 
load_snapshot.
qemu_fclose in the first function is removed by commit 660819b, and 
qemu_fclose in the
second is removed by this one. I think a potential risk might be opaque 
is closed
by anywhere else than process_incoming_migration_co, but there is legacy 
qemu_close
before commit 660819b, so the risk might be low? thanks :)
>
> I can't tell for postcopy_ram_listen_thread() - commit 660819b didn't
> seem to remove a qemu_fclose() call there, but I can't see one left
> behind either. Was the file leaked before commit 660819b or am I
> missing something?
I don't think so because loadvm_postcopy_handle_listen creates thread 
postcopy_ram_listen_thread
and passes mis->from_src_file as its arg, which will be closed by 
migration_incoming_state_destroy.
What confuses me is in the series function calls of 
qemu_loadvm_state_main etc, argument f looks
to be redundant as mis already contains from_src_file which equals to f. 
Furthermore, mis may be
also redundant as it can be got via migration_incoming_get_current. Thanks!
>
> Kevin
>
Peter Xu June 7, 2017, 3:29 a.m. UTC | #5
On Tue, Jun 06, 2017 at 06:42:18PM +0100, Dr. David Alan Gilbert wrote:
> * Kevin Wolf (kwolf@redhat.com) wrote:
> > Am 06.06.2017 um 07:24 hat QingFeng Hao geschrieben:
> > > In load_snapshot, mis->from_src_file is freed twice, the first free is by
> > > qemu_fclose, the second is by migration_incoming_state_destroy and
> > > it causes Illegal instruction exception. The fix is just to remove the
> > > first free.
> > > 
> > > This problem is found by qemu-iotests case 068 since commit
> > > "660819b migration: shut src return path unconditionally". The error is:
> > > 068 1s ... - output mismatch (see 068.out.bad)
> > >     --- tests/qemu-iotests/068.out	2017-05-06 01:00:26.417270437 +0200
> > >     +++ 068.out.bad	2017-06-03 13:59:55.360274640 +0200
> > >     @@ -6,6 +6,8 @@
> > >      QEMU X.Y.Z monitor - type 'help' for more information
> > >      (qemu) savevm 0
> > >      (qemu) quit
> > >     +./common.config: line 107: 242472 Illegal instruction     (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
> > >     +    echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
> > >     +fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
> > >      QEMU X.Y.Z monitor - type 'help' for more information
> > >     -(qemu) quit
> > >     -*** done
> > >     +(qemu) *** done
> > > 
> > > Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
> > > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Reviewed-by: Peter Xu <peterx@redhat.com>
> > 
> > Dave, as you only gave R-b rather than merging the patch, should this be
> > merged through the block tree?
> 
> I'm happy for it to go via block but also happy for it to go via
> migration; Juan is mostly doing the migration set at the moment since
> they're dominated by his cleanups.
> 
> > > diff --git a/migration/savevm.c b/migration/savevm.c
> > > index 9c320f59d0..853e14e34e 100644
> > > --- a/migration/savevm.c
> > > +++ b/migration/savevm.c
> > > @@ -2290,7 +2290,6 @@ int load_snapshot(const char *name, Error **errp)
> > >  
> > >      aio_context_acquire(aio_context);
> > >      ret = qemu_loadvm_state(f);
> > > -    qemu_fclose(f);
> > >      aio_context_release(aio_context);
> > >  
> > >      migration_incoming_state_destroy();
> > 
> > Did we check other callers of migration_incoming_state_destroy()?
> > 
> > For example, qmp_xen_load_devices_state() looks suspicious, too.
> 
> Hmm, it looks suspicious in the opposite direction; it never sets
> mis->from_src_file as was added by b4b076da into the load_snapshot path.

Agree.

Does qmp_xen_load_devices_state() needs to call
migration_incoming_state_destroy() after all? Since the latter
function only cleanups MigrationIncomingState and looks like the
former xen code didn't really use it at all.

> 
> > I can't tell for postcopy_ram_listen_thread() - commit 660819b didn't
> > seem to remove a qemu_fclose() call there, but I can't see one left
> > behind either. Was the file leaked before commit 660819b or am I
> > missing something?
> 
> I don't think there's a problem in the postcopy path, although hmm was
> I missing a close before?
> 
> Dave
> > 
> > Kevin
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Dr. David Alan Gilbert June 7, 2017, 12:18 p.m. UTC | #6
* QingFeng Hao (haoqf@linux.vnet.ibm.com) wrote:
> 
> 
> 在 2017/6/6 20:49, Kevin Wolf 写道:
> > Am 06.06.2017 um 07:24 hat QingFeng Hao geschrieben:

<snip>

> > I can't tell for postcopy_ram_listen_thread() - commit 660819b didn't
> > seem to remove a qemu_fclose() call there, but I can't see one left
> > behind either. Was the file leaked before commit 660819b or am I
> > missing something?
> I don't think so because loadvm_postcopy_handle_listen creates thread
> postcopy_ram_listen_thread
> and passes mis->from_src_file as its arg, which will be closed by
> migration_incoming_state_destroy.
> What confuses me is in the series function calls of qemu_loadvm_state_main
> etc, argument f looks
> to be redundant as mis already contains from_src_file which equals to f.

In postcopy qemu_loadvm_state_main is called with two different file
arguments but the same mis argument;  see loadvm_handle_cmd_packaged for
the other case where it's called on a packaged-file blob.

> Furthermore, mis may be
> also redundant as it can be got via migration_incoming_get_current. Thanks!

We keep changing our minds about the preferred style.  Sometimes we
think it's best to pass the pointer, sometimes we think it's best
to call get_current.

Dave

> > 
> > Kevin
> > 
> 
> -- 
> Regards
> QingFeng Hao
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Hao QingFeng June 8, 2017, 5:23 a.m. UTC | #7
在 2017/6/7 20:18, Dr. David Alan Gilbert 写道:
> * QingFeng Hao (haoqf@linux.vnet.ibm.com) wrote:
>>
>> 在 2017/6/6 20:49, Kevin Wolf 写道:
>>> Am 06.06.2017 um 07:24 hat QingFeng Hao geschrieben:
> <snip>
>
>>> I can't tell for postcopy_ram_listen_thread() - commit 660819b didn't
>>> seem to remove a qemu_fclose() call there, but I can't see one left
>>> behind either. Was the file leaked before commit 660819b or am I
>>> missing something?
>> I don't think so because loadvm_postcopy_handle_listen creates thread
>> postcopy_ram_listen_thread
>> and passes mis->from_src_file as its arg, which will be closed by
>> migration_incoming_state_destroy.
>> What confuses me is in the series function calls of qemu_loadvm_state_main
>> etc, argument f looks
>> to be redundant as mis already contains from_src_file which equals to f.
> In postcopy qemu_loadvm_state_main is called with two different file
> arguments but the same mis argument;  see loadvm_handle_cmd_packaged for
> the other case where it's called on a packaged-file blob.
yes, you are right, I missed that one. :)
>
>> Furthermore, mis may be
>> also redundant as it can be got via migration_incoming_get_current. Thanks!
> We keep changing our minds about the preferred style.  Sometimes we
> think it's best to pass the pointer, sometimes we think it's best
> to call get_current.
Got it. Thanks!
>
> Dave
>
>>> Kevin
>>>
>> -- 
>> Regards
>> QingFeng Hao
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
diff mbox

Patch

diff --git a/migration/savevm.c b/migration/savevm.c
index 9c320f59d0..853e14e34e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2290,7 +2290,6 @@  int load_snapshot(const char *name, Error **errp)
 
     aio_context_acquire(aio_context);
     ret = qemu_loadvm_state(f);
-    qemu_fclose(f);
     aio_context_release(aio_context);
 
     migration_incoming_state_destroy();