Message ID | 20180604095520.8563-2-xiaoguangrong@tencent.com (mailing list archive)
---|---
State | New, archived
On Mon, Jun 04, 2018 at 05:55:09PM +0800, guangrong.xiao@gmail.com wrote:
> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>
> Instead of putting the main thread to sleep state to wait for
> free compression thread, we can directly post it out as normal
> page that reduces the latency and uses CPUs more efficiently

The feature looks good, though I'm not sure whether we should make a
capability flag for this feature since otherwise it'll be hard to
switch back to the old full-compression way no matter for what
reason. Would that be a problem?

>
> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
> ---
>  migration/ram.c | 34 +++++++++++++++-------------------
>  1 file changed, 15 insertions(+), 19 deletions(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 5bcbf7a9f9..0caf32ab0a 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1423,25 +1423,18 @@ static int compress_page_with_multi_thread(RAMState *rs, RAMBlock *block,
>
>      thread_count = migrate_compress_threads();
>      qemu_mutex_lock(&comp_done_lock);

Can we drop this lock in this case?

> -    while (true) {
> -        for (idx = 0; idx < thread_count; idx++) {
> -            if (comp_param[idx].done) {
> -                comp_param[idx].done = false;
> -                bytes_xmit = qemu_put_qemu_file(rs->f, comp_param[idx].file);
> -                qemu_mutex_lock(&comp_param[idx].mutex);
> -                set_compress_params(&comp_param[idx], block, offset);
> -                qemu_cond_signal(&comp_param[idx].cond);
> -                qemu_mutex_unlock(&comp_param[idx].mutex);
> -                pages = 1;
> -                ram_counters.normal++;
> -                ram_counters.transferred += bytes_xmit;
> -                break;
> -            }
> -        }
> -        if (pages > 0) {
> +    for (idx = 0; idx < thread_count; idx++) {
> +        if (comp_param[idx].done) {
> +            comp_param[idx].done = false;
> +            bytes_xmit = qemu_put_qemu_file(rs->f, comp_param[idx].file);
> +            qemu_mutex_lock(&comp_param[idx].mutex);
> +            set_compress_params(&comp_param[idx], block, offset);
> +            qemu_cond_signal(&comp_param[idx].cond);
> +            qemu_mutex_unlock(&comp_param[idx].mutex);
> +            pages = 1;
> +            ram_counters.normal++;
> +            ram_counters.transferred += bytes_xmit;
>              break;
> -        } else {
> -            qemu_cond_wait(&comp_done_cond, &comp_done_lock);
>          }
>      }
>      qemu_mutex_unlock(&comp_done_lock);

Regards,
On 06/11/2018 03:39 PM, Peter Xu wrote:
> On Mon, Jun 04, 2018 at 05:55:09PM +0800, guangrong.xiao@gmail.com wrote:
>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>
>> Instead of putting the main thread to sleep state to wait for
>> free compression thread, we can directly post it out as normal
>> page that reduces the latency and uses CPUs more efficiently
>
> The feature looks good, though I'm not sure whether we should make a
> capability flag for this feature since otherwise it'll be hard to
> switch back to the old full-compression way no matter for what
> reason. Would that be a problem?
>

We assume this optimization should always be beneficial for all cases.
In particular, we introduced compression statistics, so the user should
adjust the parameters based on those statistics if anything works worse.

Furthermore, we really need to improve this optimization if it hurts
any case rather than leaving an option to the user. :)

>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
>> ---
>>  migration/ram.c | 34 +++++++++++++++-------------------
>>  1 file changed, 15 insertions(+), 19 deletions(-)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 5bcbf7a9f9..0caf32ab0a 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1423,25 +1423,18 @@ static int compress_page_with_multi_thread(RAMState *rs, RAMBlock *block,
>>
>>      thread_count = migrate_compress_threads();
>>      qemu_mutex_lock(&comp_done_lock);
>
> Can we drop this lock in this case?

The lock is used to protect comp_param[].done...

Well, we could possibly remove it if we redesigned the implementation,
e.g., by using atomic access for comp_param.done; however, I believe it
still could not work efficiently. Please see more in the later reply to
your comments in the cover-letter.
On Tue, Jun 12, 2018 at 10:42:25AM +0800, Xiao Guangrong wrote:
>
>
> On 06/11/2018 03:39 PM, Peter Xu wrote:
> > On Mon, Jun 04, 2018 at 05:55:09PM +0800, guangrong.xiao@gmail.com wrote:
> > > From: Xiao Guangrong <xiaoguangrong@tencent.com>
> > >
> > > Instead of putting the main thread to sleep state to wait for
> > > free compression thread, we can directly post it out as normal
> > > page that reduces the latency and uses CPUs more efficiently
> >
> > The feature looks good, though I'm not sure whether we should make a
> > capability flag for this feature since otherwise it'll be hard to
> > switch back to the old full-compression way no matter for what
> > reason. Would that be a problem?
> >
>
> We assume this optimization should always be beneficial for all cases.
> In particular, we introduced compression statistics, so the user should
> adjust the parameters based on those statistics if anything works worse.

Ah, that'll be good.

>
> Furthermore, we really need to improve this optimization if it hurts
> any case rather than leaving an option to the user. :)

Yeah, even if we make it a parameter/capability we can still turn that
on by default in new versions but keep the old behavior in old
versions. :) The major difference is that we then still _have_ a way
to compress every page. I'm just thinking that if we don't have a
switch for it and someone wants to measure e.g. how a new compression
algorithm could help VM migration, then he/she won't be able to do
that, since the numbers will be meaningless when we have no control
over which pages get compressed.

Though I don't know how much use it'll bring... But if that won't be
too hard, it still seems good. Not a strong opinion.

>
> > >
> > > Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
> > > ---
> > >  migration/ram.c | 34 +++++++++++++++-------------------
> > >  1 file changed, 15 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 5bcbf7a9f9..0caf32ab0a 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -1423,25 +1423,18 @@ static int compress_page_with_multi_thread(RAMState *rs, RAMBlock *block,
> > >      thread_count = migrate_compress_threads();
> > >      qemu_mutex_lock(&comp_done_lock);
> >
> > Can we drop this lock in this case?
>
> The lock is used to protect comp_param[].done...

IMHO it's okay?

It's used in this way:

  if (done) {
      done = false;
  }

So it only switches done from true->false.

And the compression thread is the only one that does the other switch
(false->true). IMHO this special case allows going lock-free, since as
long as "done" is true here, the current thread is the only one that
will modify it, so there is no race at all.

>
> Well, we could possibly remove it if we redesigned the implementation,
> e.g., by using atomic access for comp_param.done; however, I believe it
> still could not work efficiently. Please see more in the later reply to
> your comments in the cover-letter.

Will read that after it arrives; though I didn't receive a reply.
Have you missed clicking the "send" button? ;)

Regards,
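For reference, here is a minimal standalone sketch of the single-writer flag handoff Peter describes above, using C11 atomics rather than QEMU's own helpers; the type and function names are illustrative, not QEMU code:

  #include <stdatomic.h>
  #include <stdbool.h>

  typedef struct CompressParamSketch {
      atomic_bool done;   /* worker -> migration thread: "my buffer is ready" */
  } CompressParamSketch;

  /* Migration thread: try to claim an idle worker without taking a lock. */
  static bool try_claim_worker(CompressParamSketch *p)
  {
      /*
       * Acquire pairs with the worker's release store below, so anything the
       * worker wrote before setting done = true is visible here.
       */
      if (atomic_load_explicit(&p->done, memory_order_acquire)) {
          atomic_store_explicit(&p->done, false, memory_order_relaxed);
          return true;    /* caller now owns this worker's output slot */
      }
      return false;
  }

  /* Compression worker: publish that its compressed buffer can be flushed. */
  static void publish_done(CompressParamSketch *p)
  {
      atomic_store_explicit(&p->done, true, memory_order_release);
  }

Note this only covers the non-blocking check discussed here; a path that still waits on comp_done_cond would keep needing the mutex, which matches Xiao's point that dropping the lock outright would require a redesign.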
* Peter Xu (peterx@redhat.com) wrote:
> On Tue, Jun 12, 2018 at 10:42:25AM +0800, Xiao Guangrong wrote:
> >
> >
> > On 06/11/2018 03:39 PM, Peter Xu wrote:
> > > On Mon, Jun 04, 2018 at 05:55:09PM +0800, guangrong.xiao@gmail.com wrote:
> > > > From: Xiao Guangrong <xiaoguangrong@tencent.com>
> > > >
> > > > Instead of putting the main thread to sleep state to wait for
> > > > free compression thread, we can directly post it out as normal
> > > > page that reduces the latency and uses CPUs more efficiently
> > >
> > > The feature looks good, though I'm not sure whether we should make a
> > > capability flag for this feature since otherwise it'll be hard to
> > > switch back to the old full-compression way no matter for what
> > > reason. Would that be a problem?
> > >
> >
> > We assume this optimization should always be beneficial for all cases.
> > In particular, we introduced compression statistics, so the user should
> > adjust the parameters based on those statistics if anything works worse.
>
> Ah, that'll be good.
>
> >
> > Furthermore, we really need to improve this optimization if it hurts
> > any case rather than leaving an option to the user. :)
>
> Yeah, even if we make it a parameter/capability we can still turn that
> on by default in new versions but keep the old behavior in old
> versions. :) The major difference is that we then still _have_ a way
> to compress every page. I'm just thinking that if we don't have a
> switch for it and someone wants to measure e.g. how a new compression
> algorithm could help VM migration, then he/she won't be able to do
> that, since the numbers will be meaningless when we have no control
> over which pages get compressed.
>
> Though I don't know how much use it'll bring... But if that won't be
> too hard, it still seems good. Not a strong opinion.

I think that is needed; it might be that some users have really awful
networking and need the compression. I'd expect that people who turn
on compression really expect the slowdown because they need it for
their network, so changing that is a bit odd.

Dave

>
> > >
> > > Signed-off-by: Xiao Guangrong <xiaoguangrong@tencent.com>
> > > ---
> > >  migration/ram.c | 34 +++++++++++++++-------------------
> > >  1 file changed, 15 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index 5bcbf7a9f9..0caf32ab0a 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -1423,25 +1423,18 @@ static int compress_page_with_multi_thread(RAMState *rs, RAMBlock *block,
> > >      thread_count = migrate_compress_threads();
> > >      qemu_mutex_lock(&comp_done_lock);
> > >
> > > Can we drop this lock in this case?
> >
> > The lock is used to protect comp_param[].done...
>
> IMHO it's okay?
>
> It's used in this way:
>
>   if (done) {
>       done = false;
>   }
>
> So it only switches done from true->false.
>
> And the compression thread is the only one that does the other switch
> (false->true). IMHO this special case allows going lock-free, since as
> long as "done" is true here, the current thread is the only one that
> will modify it, so there is no race at all.
>
> >
> > Well, we could possibly remove it if we redesigned the implementation,
> > e.g., by using atomic access for comp_param.done; however, I believe it
> > still could not work efficiently. Please see more in the later reply to
> > your comments in the cover-letter.
>
> Will read that after it arrives; though I didn't receive a reply.
> Have you missed clicking the "send" button? ;)
>
> Regards,
>
> --
> Peter Xu

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 06/13/2018 11:43 PM, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
>> On Tue, Jun 12, 2018 at 10:42:25AM +0800, Xiao Guangrong wrote:
>>>
>>>
>>> On 06/11/2018 03:39 PM, Peter Xu wrote:
>>>> On Mon, Jun 04, 2018 at 05:55:09PM +0800, guangrong.xiao@gmail.com wrote:
>>>>> From: Xiao Guangrong <xiaoguangrong@tencent.com>
>>>>>
>>>>> Instead of putting the main thread to sleep state to wait for
>>>>> free compression thread, we can directly post it out as normal
>>>>> page that reduces the latency and uses CPUs more efficiently
>>>>
>>>> The feature looks good, though I'm not sure whether we should make a
>>>> capability flag for this feature since otherwise it'll be hard to
>>>> switch back to the old full-compression way no matter for what
>>>> reason. Would that be a problem?
>>>>
>>>
>>> We assume this optimization should always be beneficial for all cases.
>>> In particular, we introduced compression statistics, so the user should
>>> adjust the parameters based on those statistics if anything works worse.
>>
>> Ah, that'll be good.
>>
>>>
>>> Furthermore, we really need to improve this optimization if it hurts
>>> any case rather than leaving an option to the user. :)
>>
>> Yeah, even if we make it a parameter/capability we can still turn that
>> on by default in new versions but keep the old behavior in old
>> versions. :) The major difference is that we then still _have_ a way
>> to compress every page. I'm just thinking that if we don't have a
>> switch for it and someone wants to measure e.g. how a new compression
>> algorithm could help VM migration, then he/she won't be able to do
>> that, since the numbers will be meaningless when we have no control
>> over which pages get compressed.
>>
>> Though I don't know how much use it'll bring... But if that won't be
>> too hard, it still seems good. Not a strong opinion.
>
> I think that is needed; it might be that some users have really awful
> networking and need the compression. I'd expect that people who turn
> on compression really expect the slowdown because they need it for
> their network, so changing that is a bit odd.

People should make sure the system has enough CPU resources to do
compression as well, so the ideal usage is when the 'busy-rate' is low
enough, I think.

However, it's not a big deal; I will introduce a parameter, maybe
compress-wait-free-thread.

Thank you all, Dave and Peter! :)
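The compress-wait-free-thread name is only floated as a possibility here, so the following is just a self-contained sketch of how such a knob could select between the old blocking behaviour and the new fallback; it uses illustrative names and plain pthread primitives instead of QEMU's qemu_mutex/qemu_cond wrappers:

  #include <pthread.h>
  #include <stdbool.h>

  /* Illustrative stand-ins for the compression thread pool state (not QEMU types). */
  typedef struct {
      bool done;          /* set by a worker when its output is ready */
  } WorkerSlot;

  static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t  done_cond = PTHREAD_COND_INITIALIZER;

  /*
   * Return the index of a claimed idle worker, or -1 when none is free and
   * the caller is allowed to fall back to sending the page uncompressed.
   */
  static int claim_compress_worker(WorkerSlot *slots, int nslots,
                                   bool wait_for_free_thread)
  {
      int idx, claimed = -1;

      pthread_mutex_lock(&done_lock);
      while (true) {
          for (idx = 0; idx < nslots; idx++) {
              if (slots[idx].done) {
                  slots[idx].done = false;    /* claim this worker's slot */
                  claimed = idx;
                  break;
              }
          }
          if (claimed >= 0 || !wait_for_free_thread) {
              break;                          /* dispatched, or allowed to bail out */
          }
          /* Old behaviour: block until some worker signals completion. */
          pthread_cond_wait(&done_cond, &done_lock);
      }
      pthread_mutex_unlock(&done_lock);
      return claimed;
  }

With wait_for_free_thread set, every page goes through compression as before; with it cleared, the caller gets -1 back immediately and can send the page as a normal page.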
diff --git a/migration/ram.c b/migration/ram.c
index 5bcbf7a9f9..0caf32ab0a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1423,25 +1423,18 @@ static int compress_page_with_multi_thread(RAMState *rs, RAMBlock *block,
 
     thread_count = migrate_compress_threads();
     qemu_mutex_lock(&comp_done_lock);
-    while (true) {
-        for (idx = 0; idx < thread_count; idx++) {
-            if (comp_param[idx].done) {
-                comp_param[idx].done = false;
-                bytes_xmit = qemu_put_qemu_file(rs->f, comp_param[idx].file);
-                qemu_mutex_lock(&comp_param[idx].mutex);
-                set_compress_params(&comp_param[idx], block, offset);
-                qemu_cond_signal(&comp_param[idx].cond);
-                qemu_mutex_unlock(&comp_param[idx].mutex);
-                pages = 1;
-                ram_counters.normal++;
-                ram_counters.transferred += bytes_xmit;
-                break;
-            }
-        }
-        if (pages > 0) {
+    for (idx = 0; idx < thread_count; idx++) {
+        if (comp_param[idx].done) {
+            comp_param[idx].done = false;
+            bytes_xmit = qemu_put_qemu_file(rs->f, comp_param[idx].file);
+            qemu_mutex_lock(&comp_param[idx].mutex);
+            set_compress_params(&comp_param[idx], block, offset);
+            qemu_cond_signal(&comp_param[idx].cond);
+            qemu_mutex_unlock(&comp_param[idx].mutex);
+            pages = 1;
+            ram_counters.normal++;
+            ram_counters.transferred += bytes_xmit;
             break;
-        } else {
-            qemu_cond_wait(&comp_done_cond, &comp_done_lock);
         }
     }
     qemu_mutex_unlock(&comp_done_lock);
@@ -1755,7 +1748,10 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss,
      * CPU resource.
      */
     if (block == rs->last_sent_block && save_page_use_compression(rs)) {
-        return compress_page_with_multi_thread(rs, block, offset);
+        res = compress_page_with_multi_thread(rs, block, offset);
+        if (res > 0) {
+            return res;
+        }
     }
 
     return ram_save_page(rs, pss, last_stage);
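The second hunk above changes the caller so that only a positive return value from the compression path counts as "page handled". A tiny caller-side sketch of that contract, with stub helpers standing in for the real functions (illustrative names only, not the QEMU API):

  /* Stubs standing in for the real compression dispatch and normal-page path. */
  static int try_compress_page(int page_id)  { (void)page_id; return -1; }
  static int send_normal_page(int page_id)   { (void)page_id; return 1; }

  static int send_one_page(int page_id)
  {
      int res = try_compress_page(page_id);   /* > 0: a worker took the page */

      if (res > 0) {
          return res;
      }
      /* No idle compression thread: send the page uncompressed instead. */
      return send_normal_page(page_id);
  }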