Message ID | 20220329152123.493731-1-iii@linux.ibm.com |
---|---|
State | New, archived |
Series | multifd: Copy pages before compressing them with zlib |
Peter, Alex, this is the fallout of Ilya's analysis of the s390x migration issue that triggered the DFLTCC workaround.

On 29.03.22 17:21, Ilya Leoshkevich wrote:
> zlib_send_prepare() compresses pages of a running VM. zlib does not
> make any thread-safety guarantees with respect to changing deflate()
> input concurrently with deflate() [1].
>
> One can observe problems due to this with the IBM zEnterprise Data
> Compression accelerator capable zlib [2]. When the hardware
> acceleration is enabled, migration/multifd/tcp/zlib test fails
> intermittently [3] due to sliding window corruption.
>
> At the moment this problem occurs only with this accelerator, since
> its architecture explicitly discourages concurrent accesses [4]:
>
>     Page 26-57, "Other Conditions":
>
>     As observed by this CPU, other CPUs, and channel
>     programs, references to the parameter block, first,
>     second, and third operands may be multiple-access
>     references, accesses to these storage locations are
>     not necessarily block-concurrent, and the sequence
>     of these accesses or references is undefined.
>
> Still, it might affect other platforms due to a future zlib update.
> Therefore, copy the page being compressed into a private buffer before
> passing it to zlib.
>
> [1] https://zlib.net/manual.html
> [2] https://github.com/madler/zlib/pull/410
> [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
>  migration/multifd-zlib.c | 35 ++++++++++++++++++++++-------------
>  1 file changed, 22 insertions(+), 13 deletions(-)
>
> diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
> index 3a7ae44485..b6b22b7d1f 100644
> --- a/migration/multifd-zlib.c
> +++ b/migration/multifd-zlib.c
> @@ -27,6 +27,8 @@ struct zlib_data {
>      uint8_t *zbuff;
>      /* size of compressed buffer */
>      uint32_t zbuff_len;
> +    /* uncompressed buffer */
> +    uint8_t buf[];
>  };
>
>  /* Multifd zlib compression */
> @@ -43,9 +45,18 @@ struct zlib_data {
>   */
>  static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
>  {
> -    struct zlib_data *z = g_new0(struct zlib_data, 1);
> -    z_stream *zs = &z->zs;
> +    /* This is the maximum size of the compressed buffer */
> +    uint32_t zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> +    size_t buf_len = qemu_target_page_size();
> +    struct zlib_data *z;
> +    z_stream *zs;
>
> +    z = g_try_malloc0(sizeof(struct zlib_data) + buf_len + zbuff_len);
> +    if (!z) {
> +        error_setg(errp, "multifd %u: out of memory for zlib_data", p->id);
> +        return -1;
> +    }
> +    zs = &z->zs;
>      zs->zalloc = Z_NULL;
>      zs->zfree = Z_NULL;
>      zs->opaque = Z_NULL;
> @@ -54,15 +65,8 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
>          error_setg(errp, "multifd %u: deflate init failed", p->id);
>          return -1;
>      }
> -    /* This is the maxium size of the compressed buffer */
> -    z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> -    z->zbuff = g_try_malloc(z->zbuff_len);
> -    if (!z->zbuff) {
> -        deflateEnd(&z->zs);
> -        g_free(z);
> -        error_setg(errp, "multifd %u: out of memory for zbuff", p->id);
> -        return -1;
> -    }
> +    z->zbuff_len = zbuff_len;
> +    z->zbuff = z->buf + buf_len;
>      p->data = z;
>      return 0;
>  }
> @@ -80,7 +84,6 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
>      struct zlib_data *z = p->data;
>
>      deflateEnd(&z->zs);
> -    g_free(z->zbuff);
>      z->zbuff = NULL;
>      g_free(p->data);
>      p->data = NULL;
> @@ -114,8 +117,14 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
>          flush = Z_SYNC_FLUSH;
>      }
>
> +    /*
> +     * Since the VM might be running, the page may be changing concurrently
> +     * with compression. zlib does not guarantee that this is safe,
> +     * therefore copy the page before calling deflate().
> +     */
> +    memcpy(z->buf, p->pages->block->host + p->normal[i], page_size);
>      zs->avail_in = page_size;
> -    zs->next_in = p->pages->block->host + p->normal[i];
> +    zs->next_in = z->buf;
>
>      zs->avail_out = available;
>      zs->next_out = z->zbuff + out_size;
* Ilya Leoshkevich (iii@linux.ibm.com) wrote:
> zlib_send_prepare() compresses pages of a running VM. zlib does not
> make any thread-safety guarantees with respect to changing deflate()
> input concurrently with deflate() [1].
>
> One can observe problems due to this with the IBM zEnterprise Data
> Compression accelerator capable zlib [2]. When the hardware
> acceleration is enabled, migration/multifd/tcp/zlib test fails
> intermittently [3] due to sliding window corruption.
>
> At the moment this problem occurs only with this accelerator, since
> its architecture explicitly discourages concurrent accesses [4]:
>
>     Page 26-57, "Other Conditions":
>
>     As observed by this CPU, other CPUs, and channel
>     programs, references to the parameter block, first,
>     second, and third operands may be multiple-access
>     references, accesses to these storage locations are
>     not necessarily block-concurrent, and the sequence
>     of these accesses or references is undefined.
>
> Still, it might affect other platforms due to a future zlib update.
> Therefore, copy the page being compressed into a private buffer before
> passing it to zlib.

While this might work around the problem, your explanation doesn't quite
fit the symptoms; or if it does, then you have a separate problem.

The live migration code relies on the fact that the source is running
and changing its memory as the data is transmitted; however, it also
relies on the fact that if this happens, the 'dirty' flag is set _after_
those changes, causing another round of migration and retransmission of
the (now stable) data.

We don't expect the load of the data for the first page write to be
correct, consistent, etc. - we just rely on the retransmission to be
correct when the page is stable.

If your compressor hardware is doing something undefined during the
first case, that's fine, as long as it works fine in the stable case
where the data isn't changing.

Adding the extra copy is going to slow everyone else down; and since
there's plenty of pthread locking in those multifd threads, I'm expecting
them to get reasonably defined ordering and thus be safe from
multi-threading problems (please correct us if we've actually done
something wrong in the locking there).

IMHO your accelerator, when called from a zlib call, needs to behave
the same as if it were the software implementation; i.e. if we've got
pthread calls in there that are enforcing ordering, then that should be
fine; your accelerator implementation needs to add a barrier of some
type or an internal copy, not penalise everyone else.
Dave
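The ordering Dave relies on here - the guest writes a page first and the dirty flag is set afterwards, so a later pass always resends a stable copy - can be sketched in a few lines. The program below is only an editor's illustration of that invariant under simplified assumptions (the names vcpu_thread, guest_page and the atomic dirty flag are invented for the sketch); it is not QEMU code and not part of the thread.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

static unsigned char guest_page[PAGE_SIZE];
static atomic_bool dirty = true;          /* page starts out pending transmission */
static atomic_bool vcpu_running = true;

/* Stands in for a guest vCPU: write the page first, mark it dirty second. */
static void *vcpu_thread(void *arg)
{
    (void)arg;
    for (int v = 0; v < 100000; v++) {
        memset(guest_page, v & 0xff, PAGE_SIZE);   /* page changes ...       */
        atomic_store(&dirty, true);                /* ... then gets dirtied  */
    }
    atomic_store(&vcpu_running, false);
    return NULL;
}

int main(void)
{
    unsigned char sent[PAGE_SIZE];
    pthread_t tid;

    pthread_create(&tid, NULL, vcpu_thread, NULL);

    /*
     * "Migration" loop: copy the page out whenever it is dirty.  Early copies
     * may be torn because the vCPU is still writing, but every write also
     * re-dirties the page, so another pass happens until the page is stable.
     */
    while (atomic_load(&vcpu_running) || atomic_load(&dirty)) {
        if (atomic_exchange(&dirty, false)) {
            memcpy(sent, guest_page, PAGE_SIZE);
        }
    }

    pthread_join(tid, NULL);
    printf("final transmitted byte: 0x%02x\n", (unsigned)sent[0]);
    return 0;
}
```

A copy taken while the writer is mid-memset may well be torn; that is exactly the case the migration code tolerates, because the write re-dirties the page and the loop takes one more, stable copy once the writer has stopped.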
On Mon, 2022-04-04 at 12:20 +0100, Dr. David Alan Gilbert wrote: > * Ilya Leoshkevich (iii@linux.ibm.com) wrote: > > zlib_send_prepare() compresses pages of a running VM. zlib does not > > make any thread-safety guarantees with respect to changing > > deflate() > > input concurrently with deflate() [1]. > > > > One can observe problems due to this with the IBM zEnterprise Data > > Compression accelerator capable zlib [2]. When the hardware > > acceleration is enabled, migration/multifd/tcp/zlib test fails > > intermittently [3] due to sliding window corruption. > > > > At the moment this problem occurs only with this accelerator, since > > its architecture explicitly discourages concurrent accesses [4]: > > > > Page 26-57, "Other Conditions": > > > > As observed by this CPU, other CPUs, and channel > > programs, references to the parameter block, first, > > second, and third operands may be multiple-access > > references, accesses to these storage locations are > > not necessarily block-concurrent, and the sequence > > of these accesses or references is undefined. > > > > Still, it might affect other platforms due to a future zlib update. > > Therefore, copy the page being compressed into a private buffer > > before > > passing it to zlib. > > While this might work around the problem; your explanation doesn't > quite > fit with the symptoms; or if they do, then you have a separate > problem. > > The live migration code relies on the fact that the source is running > and changing it's memory as the data is transmitted; however it also > relies on the fact that if this happens the 'dirty' flag is set > _after_ > those changes causing another round of migration and retransmission > of > the (now stable) data. > > We don't expect the load of the data for the first page write to be > correct, consistent etc - we just rely on the retransmission to be > correct when the page is stable. > > If your compressor hardware is doing something undefined during the > first case that's fine; as long as it works fine in the stable case > where the data isn't changing. > > Adding the extra copy is going to slow everyone else dowmn; and since > there's plenty of pthread lockingin those multifd I'm expecting them > to get reasonably defined ordering and thus be safe from multi > threading > problems (please correct us if we've actually done something wrong in > the locking there). > > IMHO your accelerator when called from a zlib call needs to behave > the same as if it was the software implementation; i.e. if we've got > pthread calls in there that are enforcing ordering then that should > be > fine; your accelerator implementation needs to add a barrier of some > type or an internal copy, not penalise everyone else. > > Dave The problem with the accelerator is that during the first case the internal state might end up being corrupted (in particular: what goes into the deflate stream differs from what goes into the sliding window). This may affect the data integrity in the second case later on. I've been trying to think what to do with that, and of course doing an internal copy is one option (a barrier won't suffice). However, I realized that zlib API as documented doesn't guarantee that it's safe to change input data concurrently with compression. On the other hand, today's zlib is implemented in a way that tolerates this. 
So the open question for me is whether we should honor the zlib documentation (in which case, I would argue, QEMU needs to be changed) or say that the behavior of today's zlib implementation is more important (in which case the accelerator code needs to change). I went with the former for now, but the latter is of course doable as well.
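For reference, the access pattern being debated looks roughly like the sketch below (an editor's illustration only, not QEMU code): one thread keeps rewriting a buffer while another repeatedly calls deflate() on it. As discussed above, the stock software zlib produces garbage output for such a round but keeps the stream internally consistent; the zlib manual promises neither, and with the DFLTCC-backed deflate the data recorded in the sliding window can diverge from what was emitted into the stream.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define PAGE_SIZE 4096

static unsigned char page[PAGE_SIZE];   /* stands in for a guest page */
static volatile int stop;

/* Keeps rewriting the page, like a running guest would. */
static void *writer_thread(void *arg)
{
    unsigned char v = 0;

    (void)arg;
    while (!stop) {
        memset(page, v++, PAGE_SIZE);
    }
    return NULL;
}

int main(void)
{
    unsigned char out[2 * PAGE_SIZE];
    z_stream zs = {0};
    pthread_t tid;

    if (deflateInit(&zs, Z_DEFAULT_COMPRESSION) != Z_OK) {
        return 1;
    }
    pthread_create(&tid, NULL, writer_thread, NULL);

    for (int i = 0; i < 10000; i++) {
        zs.next_in = page;              /* input may change while deflate() runs */
        zs.avail_in = PAGE_SIZE;
        zs.next_out = out;
        zs.avail_out = sizeof(out);
        if (deflate(&zs, Z_SYNC_FLUSH) != Z_OK) {
            fprintf(stderr, "deflate() failed on iteration %d\n", i);
            break;
        }
    }

    stop = 1;
    pthread_join(tid, NULL);
    deflateEnd(&zs);
    return 0;
}
```

Build with something like `cc demo.c -lz -lpthread`; whether the stream survives the concurrent rewrites is an implementation detail of the zlib build in use, which is precisely the open question.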
On Mon, Apr 04, 2022 at 12:20:14PM +0100, Dr. David Alan Gilbert wrote: > * Ilya Leoshkevich (iii@linux.ibm.com) wrote: > > zlib_send_prepare() compresses pages of a running VM. zlib does not > > make any thread-safety guarantees with respect to changing deflate() > > input concurrently with deflate() [1]. > > > > One can observe problems due to this with the IBM zEnterprise Data > > Compression accelerator capable zlib [2]. When the hardware > > acceleration is enabled, migration/multifd/tcp/zlib test fails > > intermittently [3] due to sliding window corruption. > > > > At the moment this problem occurs only with this accelerator, since > > its architecture explicitly discourages concurrent accesses [4]: > > > > Page 26-57, "Other Conditions": > > > > As observed by this CPU, other CPUs, and channel > > programs, references to the parameter block, first, > > second, and third operands may be multiple-access > > references, accesses to these storage locations are > > not necessarily block-concurrent, and the sequence > > of these accesses or references is undefined. > > > > Still, it might affect other platforms due to a future zlib update. > > Therefore, copy the page being compressed into a private buffer before > > passing it to zlib. > > While this might work around the problem; your explanation doesn't quite > fit with the symptoms; or if they do, then you have a separate problem. > > The live migration code relies on the fact that the source is running > and changing it's memory as the data is transmitted; however it also > relies on the fact that if this happens the 'dirty' flag is set _after_ > those changes causing another round of migration and retransmission of > the (now stable) data. > > We don't expect the load of the data for the first page write to be > correct, consistent etc - we just rely on the retransmission to be > correct when the page is stable. > > If your compressor hardware is doing something undefined during the > first case that's fine; as long as it works fine in the stable case > where the data isn't changing. > > Adding the extra copy is going to slow everyone else dowmn; and since > there's plenty of pthread lockingin those multifd I'm expecting them > to get reasonably defined ordering and thus be safe from multi threading > problems (please correct us if we've actually done something wrong in > the locking there). > > IMHO your accelerator when called from a zlib call needs to behave > the same as if it was the software implementation; i.e. if we've got > pthread calls in there that are enforcing ordering then that should be > fine; your accelerator implementation needs to add a barrier of some > type or an internal copy, not penalise everyone else. It is reasonable to argue that QEMU is relying on undefined behaviour when invoking zlib in this case, so it isn't clear that the accelerator impl should be changed, rather than QEMU be changed to follow the zlib API requirements. With regards, Daniel
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Mon, Apr 04, 2022 at 12:20:14PM +0100, Dr. David Alan Gilbert wrote:
>> * Ilya Leoshkevich (iii@linux.ibm.com) wrote:
>> > zlib_send_prepare() compresses pages of a running VM. zlib does not
>> > make any thread-safety guarantees with respect to changing deflate()
>> > input concurrently with deflate() [1].
>> >
>> > One can observe problems due to this with the IBM zEnterprise Data
>> > Compression accelerator capable zlib [2]. When the hardware
>> > acceleration is enabled, migration/multifd/tcp/zlib test fails
>> > intermittently [3] due to sliding window corruption.
>> >
>> > At the moment this problem occurs only with this accelerator, since
>> > its architecture explicitly discourages concurrent accesses [4]:
>> >
>> >     Page 26-57, "Other Conditions":
>> >
>> >     As observed by this CPU, other CPUs, and channel
>> >     programs, references to the parameter block, first,
>> >     second, and third operands may be multiple-access
>> >     references, accesses to these storage locations are
>> >     not necessarily block-concurrent, and the sequence
>> >     of these accesses or references is undefined.
>> >
>> > Still, it might affect other platforms due to a future zlib update.
>> > Therefore, copy the page being compressed into a private buffer before
>> > passing it to zlib.
>>
>> While this might work around the problem, your explanation doesn't quite
>> fit the symptoms; or if it does, then you have a separate problem.
>>
>> The live migration code relies on the fact that the source is running
>> and changing its memory as the data is transmitted; however, it also
>> relies on the fact that if this happens, the 'dirty' flag is set _after_
>> those changes, causing another round of migration and retransmission of
>> the (now stable) data.
>>
>> We don't expect the load of the data for the first page write to be
>> correct, consistent, etc. - we just rely on the retransmission to be
>> correct when the page is stable.
>>
>> If your compressor hardware is doing something undefined during the
>> first case, that's fine, as long as it works fine in the stable case
>> where the data isn't changing.
>>
>> Adding the extra copy is going to slow everyone else down; and since
>> there's plenty of pthread locking in those multifd threads, I'm expecting
>> them to get reasonably defined ordering and thus be safe from
>> multi-threading problems (please correct us if we've actually done
>> something wrong in the locking there).
>>
>> IMHO your accelerator, when called from a zlib call, needs to behave
>> the same as if it were the software implementation; i.e. if we've got
>> pthread calls in there that are enforcing ordering, then that should be
>> fine; your accelerator implementation needs to add a barrier of some
>> type or an internal copy, not penalise everyone else.
>
> It is reasonable to argue that QEMU is relying on undefined behaviour
> when invoking zlib in this case, so it isn't clear that the accelerator
> impl should be changed, rather than QEMU be changed to follow the zlib
> API requirements.

It works in all the other cases. My vote, if we need that, is that we add
a zlib-sync or similar method. zlib already means doing a copy; doing an
extra copy will cost too much, in my opinion.

While we are at it, is there such a requirement for zstd? In my testing,
zstd was basically always better than zlib (no, I don't remember the
details).

Later, Juan.
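One conceivable middle ground - shown purely as a hypothetical sketch, not an existing or proposed QEMU interface - would be to gate the copy on whether the deflate() implementation in use actually needs stable input, leaving the software-only path at its current zero-copy behaviour. The migrate_zlib_needs_stable_input() helper below is invented for illustration; the surrounding variables are those of zlib_send_prepare() from the patch:

```c
    /*
     * Hypothetical sketch only: migrate_zlib_needs_stable_input() does not
     * exist in QEMU.  It stands in for "the deflate() in use requires input
     * that does not change underneath it" and merely illustrates how the
     * staging copy could be made opt-in inside zlib_send_prepare().
     */
    if (migrate_zlib_needs_stable_input()) {
        /* Accelerated deflate: hand zlib a private, stable copy of the page. */
        memcpy(z->buf, p->pages->block->host + p->normal[i], page_size);
        zs->next_in = z->buf;
    } else {
        /* Software deflate: compress the live page directly, as done today. */
        zs->next_in = p->pages->block->host + p->normal[i];
    }
    zs->avail_in = page_size;
```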
* Ilya Leoshkevich (iii@linux.ibm.com) wrote: > On Mon, 2022-04-04 at 12:20 +0100, Dr. David Alan Gilbert wrote: > > * Ilya Leoshkevich (iii@linux.ibm.com) wrote: > > > zlib_send_prepare() compresses pages of a running VM. zlib does not > > > make any thread-safety guarantees with respect to changing > > > deflate() > > > input concurrently with deflate() [1]. > > > > > > One can observe problems due to this with the IBM zEnterprise Data > > > Compression accelerator capable zlib [2]. When the hardware > > > acceleration is enabled, migration/multifd/tcp/zlib test fails > > > intermittently [3] due to sliding window corruption. > > > > > > At the moment this problem occurs only with this accelerator, since > > > its architecture explicitly discourages concurrent accesses [4]: > > > > > > Page 26-57, "Other Conditions": > > > > > > As observed by this CPU, other CPUs, and channel > > > programs, references to the parameter block, first, > > > second, and third operands may be multiple-access > > > references, accesses to these storage locations are > > > not necessarily block-concurrent, and the sequence > > > of these accesses or references is undefined. > > > > > > Still, it might affect other platforms due to a future zlib update. > > > Therefore, copy the page being compressed into a private buffer > > > before > > > passing it to zlib. > > > > While this might work around the problem; your explanation doesn't > > quite > > fit with the symptoms; or if they do, then you have a separate > > problem. > > > > The live migration code relies on the fact that the source is running > > and changing it's memory as the data is transmitted; however it also > > relies on the fact that if this happens the 'dirty' flag is set > > _after_ > > those changes causing another round of migration and retransmission > > of > > the (now stable) data. > > > > We don't expect the load of the data for the first page write to be > > correct, consistent etc - we just rely on the retransmission to be > > correct when the page is stable. > > > > If your compressor hardware is doing something undefined during the > > first case that's fine; as long as it works fine in the stable case > > where the data isn't changing. > > > > Adding the extra copy is going to slow everyone else dowmn; and since > > there's plenty of pthread lockingin those multifd I'm expecting them > > to get reasonably defined ordering and thus be safe from multi > > threading > > problems (please correct us if we've actually done something wrong in > > the locking there). > > > > IMHO your accelerator when called from a zlib call needs to behave > > the same as if it was the software implementation; i.e. if we've got > > pthread calls in there that are enforcing ordering then that should > > be > > fine; your accelerator implementation needs to add a barrier of some > > type or an internal copy, not penalise everyone else. > > > > Dave > > The problem with the accelerator is that during the first case the > internal state might end up being corrupted (in particular: what goes > into the deflate stream differs from what goes into the sliding > window). This may affect the data integrity in the second case later > on. Hmm I hadn't expected the unpredictability to span multiple blocks. > I've been trying to think what to do with that, and of course doing an > internal copy is one option (a barrier won't suffice). However, I > realized that zlib API as documented doesn't guarantee that it's safe > to change input data concurrently with compression. 
On the other hand, > today's zlib is implemented in a way that tolerates this. > > So the open question for me is, whether we should honor zlib > documentation (in which case, I would argue, QEMU needs to be changed) > or say that the behavior of today's zlib implementation is more > important (in which case accelerator code needs to change). I went with > the former for now, but the latter is of course doable as well.

Well, I think you're saying that the current docs don't specify this, and thus you assume that there's a constraint. I think the right people to answer this are the zlib community; so can you send a mail to zlib-devel and ask?

Dave
diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 3a7ae44485..b6b22b7d1f 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -27,6 +27,8 @@ struct zlib_data {
     uint8_t *zbuff;
     /* size of compressed buffer */
     uint32_t zbuff_len;
+    /* uncompressed buffer */
+    uint8_t buf[];
 };
 
 /* Multifd zlib compression */
@@ -43,9 +45,18 @@ struct zlib_data {
  */
 static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
 {
-    struct zlib_data *z = g_new0(struct zlib_data, 1);
-    z_stream *zs = &z->zs;
+    /* This is the maximum size of the compressed buffer */
+    uint32_t zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
+    size_t buf_len = qemu_target_page_size();
+    struct zlib_data *z;
+    z_stream *zs;
 
+    z = g_try_malloc0(sizeof(struct zlib_data) + buf_len + zbuff_len);
+    if (!z) {
+        error_setg(errp, "multifd %u: out of memory for zlib_data", p->id);
+        return -1;
+    }
+    zs = &z->zs;
     zs->zalloc = Z_NULL;
     zs->zfree = Z_NULL;
     zs->opaque = Z_NULL;
@@ -54,15 +65,8 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
         error_setg(errp, "multifd %u: deflate init failed", p->id);
         return -1;
     }
-    /* This is the maxium size of the compressed buffer */
-    z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
-    z->zbuff = g_try_malloc(z->zbuff_len);
-    if (!z->zbuff) {
-        deflateEnd(&z->zs);
-        g_free(z);
-        error_setg(errp, "multifd %u: out of memory for zbuff", p->id);
-        return -1;
-    }
+    z->zbuff_len = zbuff_len;
+    z->zbuff = z->buf + buf_len;
     p->data = z;
     return 0;
 }
@@ -80,7 +84,6 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
     struct zlib_data *z = p->data;
 
     deflateEnd(&z->zs);
-    g_free(z->zbuff);
     z->zbuff = NULL;
     g_free(p->data);
     p->data = NULL;
@@ -114,8 +117,14 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
         flush = Z_SYNC_FLUSH;
     }
 
+    /*
+     * Since the VM might be running, the page may be changing concurrently
+     * with compression. zlib does not guarantee that this is safe,
+     * therefore copy the page before calling deflate().
+     */
+    memcpy(z->buf, p->pages->block->host + p->normal[i], page_size);
     zs->avail_in = page_size;
-    zs->next_in = p->pages->block->host + p->normal[i];
+    zs->next_in = z->buf;
 
     zs->avail_out = available;
     zs->next_out = z->zbuff + out_size;
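To make the memory management in the hunks above easier to follow: the patch replaces the separately allocated zbuff with a single allocation whose flexible array member holds both the page staging area and the compressed-output buffer. A rough picture of the resulting layout (illustrative comment only, not new code):

```c
/*
 * buf_len is qemu_target_page_size(); zbuff_len is
 * compressBound(MULTIFD_PACKET_SIZE).
 *
 *   z = g_try_malloc0(sizeof(struct zlib_data) + buf_len + zbuff_len);
 *
 *   +------------------+----------------------+------------------------+
 *   | struct zlib_data | buf[0 .. buf_len)    | zbuff[0 .. zbuff_len)  |
 *   | zs, zbuff, ...   | staging copy of page | compressed output      |
 *   +------------------+----------------------+------------------------+
 *                        ^ z->buf               ^ z->zbuff = z->buf + buf_len
 *
 * Because everything lives in one block, zlib_send_cleanup() no longer
 * needs the separate g_free(z->zbuff).
 */
```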
zlib_send_prepare() compresses pages of a running VM. zlib does not
make any thread-safety guarantees with respect to changing deflate()
input concurrently with deflate() [1].

One can observe problems due to this with the IBM zEnterprise Data
Compression accelerator capable zlib [2]. When the hardware
acceleration is enabled, migration/multifd/tcp/zlib test fails
intermittently [3] due to sliding window corruption.

At the moment this problem occurs only with this accelerator, since
its architecture explicitly discourages concurrent accesses [4]:

    Page 26-57, "Other Conditions":

    As observed by this CPU, other CPUs, and channel
    programs, references to the parameter block, first,
    second, and third operands may be multiple-access
    references, accesses to these storage locations are
    not necessarily block-concurrent, and the sequence
    of these accesses or references is undefined.

Still, it might affect other platforms due to a future zlib update.
Therefore, copy the page being compressed into a private buffer before
passing it to zlib.

[1] https://zlib.net/manual.html
[2] https://github.com/madler/zlib/pull/410
[3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
[4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 migration/multifd-zlib.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)