Message ID | 20180719121520.30026-9-xiaoguangrong@tencent.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, Jul 19, 2018 at 08:15:20PM +0800, guangrong.xiao@gmail.com wrote:
> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>
> flush_compressed_data() needs to wait all compression threads to
> finish their work, after that all threads are free until the
> migration feeds new request to them, reducing its call can improve
> the throughput and use CPU resource more effectively
>
> We do not need to flush all threads at the end of iteration, the
> data can be kept locally until the memory block is changed or
> memory migration starts over in that case we will meet a dirtied
> page which may still exists in compression threads's ring
>
> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
> ---
>  migration/ram.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 89305c7af5..fdab13821d 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -315,6 +315,8 @@ struct RAMState {
>      uint64_t iterations;
>      /* number of dirty bits in the bitmap */
>      uint64_t migration_dirty_pages;
> +    /* last dirty_sync_count we have seen */
> +    uint64_t dirty_sync_count;

Better suffix it with "_prev" as well? So that we can quickly
identify that it's only a cache and it can be different from the one
in the ram_counters.

>      /* protects modification of the bitmap */
>      QemuMutex bitmap_mutex;
>      /* The RAMBlock used in the last src_page_requests */
> @@ -2532,6 +2534,7 @@ static void ram_save_cleanup(void *opaque)
>      }
>
>      xbzrle_cleanup();
> +    flush_compressed_data(*rsp);

Could I ask why do we need this considering that we have
compress_threads_save_cleanup() right down there?

>      compress_threads_save_cleanup();
>      ram_state_cleanup(rsp);
>  }
> @@ -3203,6 +3206,17 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>
>      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>
> +    /*
> +     * if memory migration starts over, we will meet a dirtied page which
> +     * may still exists in compression threads's ring, so we should flush
> +     * the compressed data to make sure the new page is not overwritten by
> +     * the old one in the destination.
> +     */
> +    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
> +        rs->dirty_sync_count = ram_counters.dirty_sync_count;
> +        flush_compressed_data(rs);
> +    }
> +
>      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      i = 0;
>      while ((ret = qemu_file_rate_limit(f)) == 0 ||
> @@ -3235,7 +3249,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>          }
>          i++;
>      }
> -    flush_compressed_data(rs);

This looks sane to me, but I'd like to see how other people would
think about it too...

>      rcu_read_unlock();
>
>      /*
> --
> 2.14.4
>

Regards,
On 07/23/2018 01:49 PM, Peter Xu wrote:
> On Thu, Jul 19, 2018 at 08:15:20PM +0800, guangrong.xiao@gmail.com wrote:
>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>
>> flush_compressed_data() needs to wait all compression threads to
>> finish their work, after that all threads are free until the
>> migration feeds new request to them, reducing its call can improve
>> the throughput and use CPU resource more effectively
>>
>> We do not need to flush all threads at the end of iteration, the
>> data can be kept locally until the memory block is changed or
>> memory migration starts over in that case we will meet a dirtied
>> page which may still exists in compression threads's ring
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
>> ---
>>  migration/ram.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 89305c7af5..fdab13821d 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -315,6 +315,8 @@ struct RAMState {
>>      uint64_t iterations;
>>      /* number of dirty bits in the bitmap */
>>      uint64_t migration_dirty_pages;
>> +    /* last dirty_sync_count we have seen */
>> +    uint64_t dirty_sync_count;
>
> Better suffix it with "_prev" as well? So that we can quickly
> identify that it's only a cache and it can be different from the one
> in the ram_counters.

Indeed, will update it.

>
>>      /* protects modification of the bitmap */
>>      QemuMutex bitmap_mutex;
>>      /* The RAMBlock used in the last src_page_requests */
>> @@ -2532,6 +2534,7 @@ static void ram_save_cleanup(void *opaque)
>>      }
>>
>>      xbzrle_cleanup();
>> +    flush_compressed_data(*rsp);
>
> Could I ask why do we need this considering that we have
> compress_threads_save_cleanup() right down there?

Dave asked it too. :(

"This is for the error condition, if any error occurred during live migration,
there is no chance to call ram_save_complete. After using the lockless
multithreads model, we assert all requests have been handled before destroy
the work threads."

That makes sure there is nothing left in the threads before doing
compress_threads_save_cleanup() as current behavior. For lockless
multithread model, we check if all requests are free before destroying
them.

>
>>      compress_threads_save_cleanup();
>>      ram_state_cleanup(rsp);
>>  }
>> @@ -3203,6 +3206,17 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>
>>      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
>>
>> +    /*
>> +     * if memory migration starts over, we will meet a dirtied page which
>> +     * may still exists in compression threads's ring, so we should flush
>> +     * the compressed data to make sure the new page is not overwritten by
>> +     * the old one in the destination.
>> +     */
>> +    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
>> +        rs->dirty_sync_count = ram_counters.dirty_sync_count;
>> +        flush_compressed_data(rs);
>> +    }
>> +
>>      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>>      i = 0;
>>      while ((ret = qemu_file_rate_limit(f)) == 0 ||
>> @@ -3235,7 +3249,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>>          }
>>          i++;
>>      }
>> -    flush_compressed_data(rs);
>
> This looks sane to me, but I'd like to see how other people would
> think about it too...

Thank you a lot, Peter! :)
On Mon, Jul 23, 2018 at 04:05:21PM +0800, Xiao Guangrong wrote:
>
>
> On 07/23/2018 01:49 PM, Peter Xu wrote:
> > On Thu, Jul 19, 2018 at 08:15:20PM +0800, guangrong.xiao@gmail.com wrote:
> > > From: Xiao Guangrong <xiaoguangrong@tencent.com>
> > >
> > > flush_compressed_data() needs to wait all compression threads to
> > > finish their work, after that all threads are free until the
> > > migration feeds new request to them, reducing its call can improve
> > > the throughput and use CPU resource more effectively
> > >
> > > We do not need to flush all threads at the end of iteration, the
> > > data can be kept locally until the memory block is changed or
> > > memory migration starts over in that case we will meet a dirtied
> > > page which may still exists in compression threads's ring
> > >
> > > Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
> > > ---
> > >  migration/ram.c | 15 ++++++++++++++-
> > >  1 file changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 89305c7af5..fdab13821d 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -315,6 +315,8 @@ struct RAMState {
> > >      uint64_t iterations;
> > >      /* number of dirty bits in the bitmap */
> > >      uint64_t migration_dirty_pages;
> > > +    /* last dirty_sync_count we have seen */
> > > +    uint64_t dirty_sync_count;
> >
> > Better suffix it with "_prev" as well? So that we can quickly
> > identify that it's only a cache and it can be different from the one
> > in the ram_counters.
>
> Indeed, will update it.
>
> >
> > >      /* protects modification of the bitmap */
> > >      QemuMutex bitmap_mutex;
> > >      /* The RAMBlock used in the last src_page_requests */
> > > @@ -2532,6 +2534,7 @@ static void ram_save_cleanup(void *opaque)
> > >      }
> > >      xbzrle_cleanup();
> > > +    flush_compressed_data(*rsp);
> >
> > Could I ask why do we need this considering that we have
> > compress_threads_save_cleanup() right down there?
>
> Dave asked it too. :(
>
> "This is for the error condition, if any error occurred during live migration,
> there is no chance to call ram_save_complete. After using the lockless
> multithreads model, we assert all requests have been handled before destroy
> the work threads."
>
> That makes sure there is nothing left in the threads before doing
> compress_threads_save_cleanup() as current behavior. For lockless
> multithread model, we check if all requests are free before destroying
> them.

But why do we need to explicitly flush it here? Now in
compress_threads_save_cleanup() we have qemu_fclose() on the buffers,
which logically will flush the data and clean up everything too.
Would that suffice?

>
> >
> > >      compress_threads_save_cleanup();
> > >      ram_state_cleanup(rsp);
> > >  }
> > > @@ -3203,6 +3206,17 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> > >      ram_control_before_iterate(f, RAM_CONTROL_ROUND);
> > > +    /*
> > > +     * if memory migration starts over, we will meet a dirtied page which
> > > +     * may still exists in compression threads's ring, so we should flush
> > > +     * the compressed data to make sure the new page is not overwritten by
> > > +     * the old one in the destination.
> > > +     */
> > > +    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
> > > +        rs->dirty_sync_count = ram_counters.dirty_sync_count;
> > > +        flush_compressed_data(rs);
> > > +    }
> > > +
> > >      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> > >      i = 0;
> > >      while ((ret = qemu_file_rate_limit(f)) == 0 ||
> > > @@ -3235,7 +3249,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> > >          }
> > >          i++;
> > >      }
> > > -    flush_compressed_data(rs);
> >
> > This looks sane to me, but I'd like to see how other people would
> > think about it too...
>
> Thank you a lot, Peter! :)

Welcome. :)

Regards,
On 07/23/2018 04:35 PM, Peter Xu wrote:
> On Mon, Jul 23, 2018 at 04:05:21PM +0800, Xiao Guangrong wrote:
>>
>>
>> On 07/23/2018 01:49 PM, Peter Xu wrote:
>>> On Thu, Jul 19, 2018 at 08:15:20PM +0800, guangrong.xiao@gmail.com wrote:
>>>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>>>
>>>> flush_compressed_data() needs to wait all compression threads to
>>>> finish their work, after that all threads are free until the
>>>> migration feeds new request to them, reducing its call can improve
>>>> the throughput and use CPU resource more effectively
>>>>
>>>> We do not need to flush all threads at the end of iteration, the
>>>> data can be kept locally until the memory block is changed or
>>>> memory migration starts over in that case we will meet a dirtied
>>>> page which may still exists in compression threads's ring
>>>>
>>>> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
>>>> ---
>>>>  migration/ram.c | 15 ++++++++++++++-
>>>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/migration/ram.c b/migration/ram.c
>>>> index 89305c7af5..fdab13821d 100644
>>>> --- a/migration/ram.c
>>>> +++ b/migration/ram.c
>>>> @@ -315,6 +315,8 @@ struct RAMState {
>>>>      uint64_t iterations;
>>>>      /* number of dirty bits in the bitmap */
>>>>      uint64_t migration_dirty_pages;
>>>> +    /* last dirty_sync_count we have seen */
>>>> +    uint64_t dirty_sync_count;
>>>
>>> Better suffix it with "_prev" as well? So that we can quickly
>>> identify that it's only a cache and it can be different from the one
>>> in the ram_counters.
>>
>> Indeed, will update it.
>>
>>>
>>>>      /* protects modification of the bitmap */
>>>>      QemuMutex bitmap_mutex;
>>>>      /* The RAMBlock used in the last src_page_requests */
>>>> @@ -2532,6 +2534,7 @@ static void ram_save_cleanup(void *opaque)
>>>>      }
>>>>      xbzrle_cleanup();
>>>> +    flush_compressed_data(*rsp);
>>>
>>> Could I ask why do we need this considering that we have
>>> compress_threads_save_cleanup() right down there?
>>
>> Dave asked it too. :(
>>
>> "This is for the error condition, if any error occurred during live migration,
>> there is no chance to call ram_save_complete. After using the lockless
>> multithreads model, we assert all requests have been handled before destroy
>> the work threads."
>>
>> That makes sure there is nothing left in the threads before doing
>> compress_threads_save_cleanup() as current behavior. For lockless
>> multithread model, we check if all requests are free before destroying
>> them.
>
> But why do we need to explicitly flush it here? Now in
> compress_threads_save_cleanup() we have qemu_fclose() on the buffers,
> which logically will flush the data and clean up everything too.
> Would that suffice?
>

Yes, it's sufficient for current thread model, will drop it for now
and add it at the time when the lockless multithread model is applied. :)
On Mon, Jul 23, 2018 at 04:53:11PM +0800, Xiao Guangrong wrote:
>
>
> On 07/23/2018 04:35 PM, Peter Xu wrote:
> > On Mon, Jul 23, 2018 at 04:05:21PM +0800, Xiao Guangrong wrote:
> > >
> > >
> > > On 07/23/2018 01:49 PM, Peter Xu wrote:
> > > > On Thu, Jul 19, 2018 at 08:15:20PM +0800, guangrong.xiao@gmail.com wrote:
> > > > > From: Xiao Guangrong <xiaoguangrong@tencent.com>
> > > > >
> > > > > flush_compressed_data() needs to wait all compression threads to
> > > > > finish their work, after that all threads are free until the
> > > > > migration feeds new request to them, reducing its call can improve
> > > > > the throughput and use CPU resource more effectively
> > > > >
> > > > > We do not need to flush all threads at the end of iteration, the
> > > > > data can be kept locally until the memory block is changed or
> > > > > memory migration starts over in that case we will meet a dirtied
> > > > > page which may still exists in compression threads's ring
> > > > >
> > > > > Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
> > > > > ---
> > > > >  migration/ram.c | 15 ++++++++++++++-
> > > > >  1 file changed, 14 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/migration/ram.c b/migration/ram.c
> > > > > index 89305c7af5..fdab13821d 100644
> > > > > --- a/migration/ram.c
> > > > > +++ b/migration/ram.c
> > > > > @@ -315,6 +315,8 @@ struct RAMState {
> > > > >      uint64_t iterations;
> > > > >      /* number of dirty bits in the bitmap */
> > > > >      uint64_t migration_dirty_pages;
> > > > > +    /* last dirty_sync_count we have seen */
> > > > > +    uint64_t dirty_sync_count;
> > > >
> > > > Better suffix it with "_prev" as well? So that we can quickly
> > > > identify that it's only a cache and it can be different from the one
> > > > in the ram_counters.
> > >
> > > Indeed, will update it.
> > >
> > > >
> > > > >      /* protects modification of the bitmap */
> > > > >      QemuMutex bitmap_mutex;
> > > > >      /* The RAMBlock used in the last src_page_requests */
> > > > > @@ -2532,6 +2534,7 @@ static void ram_save_cleanup(void *opaque)
> > > > >      }
> > > > >      xbzrle_cleanup();
> > > > > +    flush_compressed_data(*rsp);
> > > >
> > > > Could I ask why do we need this considering that we have
> > > > compress_threads_save_cleanup() right down there?
> > >
> > > Dave asked it too. :(
> > >
> > > "This is for the error condition, if any error occurred during live migration,
> > > there is no chance to call ram_save_complete. After using the lockless
> > > multithreads model, we assert all requests have been handled before destroy
> > > the work threads."
> > >
> > > That makes sure there is nothing left in the threads before doing
> > > compress_threads_save_cleanup() as current behavior. For lockless
> > > multithread model, we check if all requests are free before destroying
> > > them.
> >
> > But why do we need to explicitly flush it here? Now in
> > compress_threads_save_cleanup() we have qemu_fclose() on the buffers,
> > which logically will flush the data and clean up everything too.
> > Would that suffice?
> >
>
> Yes, it's sufficient for current thread model, will drop it for now
> and add it at the time when the lockless multithread model is applied. :)

Ah I think I see your point. Even if so I would think it better to do
any extra cleanup directly in compress_threads_save_cleanup() if
possible.

Regards,
On 07/23/2018 05:01 PM, Peter Xu wrote:
>> Yes, it's sufficient for current thread model, will drop it for now
>> and add it at the time when the lockless multithread model is applied. :)
>
> Ah I think I see your point. Even if so I would think it better to do
> any extra cleanup directly in compress_threads_save_cleanup() if
> possible.
>

Okay, got it.
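The suggestion at the end of the thread, doing any extra cleanup inside compress_threads_save_cleanup() instead of calling flush_compressed_data() before it, can be pictured with a small standalone model. This is only a sketch under stated assumptions: every toy_ name below is hypothetical and none of it is QEMU code; it just shows a cleanup routine that drains outstanding requests itself, so error paths need no separate flush call.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_WORKERS 4            /* hypothetical worker count */

/* Hypothetical stand-in for one compression thread's state. */
struct toy_worker {
    bool running;                   /* worker has been started */
    int pending;                    /* requests not yet handled */
};

static struct toy_worker toy_workers[TOY_NR_WORKERS];

/* Start every worker with an empty queue. */
static void toy_workers_setup(void)
{
    for (size_t i = 0; i < TOY_NR_WORKERS; i++) {
        toy_workers[i].running = true;
        toy_workers[i].pending = 0;
    }
}

/* Drain outstanding requests; returns how many were handled. */
static int toy_workers_drain(void)
{
    int handled = 0;

    for (size_t i = 0; i < TOY_NR_WORKERS; i++) {
        handled += toy_workers[i].pending;
        toy_workers[i].pending = 0;
    }
    return handled;
}

/*
 * Cleanup drains first and only then stops the workers, so a caller
 * does not have to remember a flush before calling it.
 * Returns the number of requests drained on the way out.
 */
static int toy_workers_cleanup(void)
{
    int handled = toy_workers_drain();

    for (size_t i = 0; i < TOY_NR_WORKERS; i++) {
        toy_workers[i].running = false;
    }
    return handled;
}
```

With this shape, a caller such as an error path only calls toy_workers_cleanup(); any "all requests handled" assertion added later would live next to the drain inside that one function.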
diff --git a/migration/ram.c b/migration/ram.c
index 89305c7af5..fdab13821d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -315,6 +315,8 @@ struct RAMState {
     uint64_t iterations;
     /* number of dirty bits in the bitmap */
     uint64_t migration_dirty_pages;
+    /* last dirty_sync_count we have seen */
+    uint64_t dirty_sync_count;
     /* protects modification of the bitmap */
     QemuMutex bitmap_mutex;
     /* The RAMBlock used in the last src_page_requests */
@@ -2532,6 +2534,7 @@ static void ram_save_cleanup(void *opaque)
     }
 
     xbzrle_cleanup();
+    flush_compressed_data(*rsp);
     compress_threads_save_cleanup();
     ram_state_cleanup(rsp);
 }
@@ -3203,6 +3206,17 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 
     ram_control_before_iterate(f, RAM_CONTROL_ROUND);
 
+    /*
+     * if memory migration starts over, we will meet a dirtied page which
+     * may still exists in compression threads's ring, so we should flush
+     * the compressed data to make sure the new page is not overwritten by
+     * the old one in the destination.
+     */
+    if (ram_counters.dirty_sync_count != rs->dirty_sync_count) {
+        rs->dirty_sync_count = ram_counters.dirty_sync_count;
+        flush_compressed_data(rs);
+    }
+
     t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     i = 0;
     while ((ret = qemu_file_rate_limit(f)) == 0 ||
@@ -3235,7 +3249,6 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         }
         i++;
     }
-    flush_compressed_data(rs);
     rcu_read_unlock();
 
     /*
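The check this patch adds to ram_save_iterate() can be read as "flush only when the dirty-bitmap sync count has moved on since the last iteration". The following is a minimal standalone sketch of that pattern, not the QEMU implementation: the toy_ types and functions are hypothetical stand-ins for ram_counters, RAMState and flush_compressed_data(), and the cached field uses the "_prev" suffix proposed in review.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the global ram_counters. */
struct toy_counters {
    uint64_t dirty_sync_count;      /* bumped on every bitmap sync */
};

/* Hypothetical stand-in for RAMState. */
struct toy_state {
    uint64_t dirty_sync_count_prev; /* last sync count we have seen */
    unsigned flushes;               /* how often the flush ran */
};

/* Placeholder for flush_compressed_data(); it only counts calls. */
static void toy_flush(struct toy_state *rs)
{
    rs->flushes++;
}

/*
 * Called at the start of each iteration. The flush runs only when a
 * new bitmap sync happened, because only then can a freshly dirtied
 * page race with an older copy still queued in a worker.
 * Returns true if a flush was performed.
 */
static bool toy_iterate_begin(struct toy_state *rs,
                              const struct toy_counters *c)
{
    if (c->dirty_sync_count != rs->dirty_sync_count_prev) {
        rs->dirty_sync_count_prev = c->dirty_sync_count;
        toy_flush(rs);
        return true;
    }
    return false;
}
```

Calling toy_iterate_begin() repeatedly with an unchanged counter performs no further flushes, which is the saving the commit message describes; the flush happens again only after the counter advances.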