Message ID | 20250107195025.9951-7-farosas@suse.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | migration: Fix s390 regressions + migration script | expand |
On Tue, Jan 07, 2025 at 04:50:24PM -0300, Fabiano Rosas wrote: > Currently, if an array of pointers contains a NULL pointer, that > pointer will be encoded as '0' in the stream. Since the JSON writer > doesn't define a "pointer" type, that '0' will now be an uint64, which > is different from the original type being pointed to, e.g. struct. > > That mixed-type array shouldn't be compressed, otherwise data is lost > as the code currently makes the whole array have the type of the first > element. > > While we could disable the array compression when a NULL pointer is > found, the JSON part of the stream still makes part of downtime, so we > should avoid writing unecessary bytes to it. > > Keep the array compression in place, but break the array into several > type-contiguous pieces if NULL and non-NULL pointers are mixed. Could I request for a sample JSON dump for an example array in the commit log? This whole solution looks working but is tricky. A sample could help people understand (e.g. showing the same "name" being dumped multiple times..). Side note: I tried to dump a very basic VM's JSON out to disk, it scares me on the size: $ ls -lhS JSON.out -rw-r--r--. 1 peterx peterx 106K Jan 7 17:18 JSON.out That's a simplest VM with all default stuff, mostly nothing complex.. I may really need to measure how the JSON debug strings affect migration function or perf at some point.. > > Signed-off-by: Fabiano Rosas <farosas@suse.de> > --- > migration/vmstate.c | 33 ++++++++++++++++++++++++++++++++- > scripts/analyze-migration.py | 9 ++++++++- > 2 files changed, 40 insertions(+), 2 deletions(-) > > diff --git a/migration/vmstate.c b/migration/vmstate.c > index 52704c822c..a79ccf3875 100644 > --- a/migration/vmstate.c > +++ b/migration/vmstate.c > @@ -425,15 +425,19 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, > int size = vmstate_size(opaque, field); > uint64_t old_offset, written_bytes; > JSONWriter *vmdesc_loop = vmdesc; > + bool is_prev_null = false; > > trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems); > if (field->flags & VMS_POINTER) { > first_elem = *(void **)first_elem; > assert(first_elem || !n_elems || !size); > } > + > for (i = 0; i < n_elems; i++) { > void *curr_elem = first_elem + size * i; > const VMStateField *inner_field; > + bool is_null; > + int max_elems = n_elems - i; > > old_offset = qemu_file_transferred(f); > if (field->flags & VMS_ARRAY_OF_POINTER) { > @@ -448,12 +452,39 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, > * not follow. > */ > inner_field = vmsd_create_fake_nullptr_field(field); > + is_null = true; > } else { > inner_field = field; > + is_null = false; > + } > + > + /* > + * Due to the fake nullptr handling above, if there's mixed > + * null/non-null data, it doesn't make sense to emit a > + * compressed array representation spanning the entire array > + * because the field types will be different (e.g. struct > + * vs. uint64_t). Search ahead for the next null/non-null > + * element and start a new compressed array if found. > + */ > + if (field->flags & VMS_ARRAY_OF_POINTER && > + is_null != is_prev_null) { > + > + is_prev_null = is_null; > + vmdesc_loop = vmdesc; > + > + for (int j = i + 1; j < n_elems; j++) { > + void *elem = *(void **)(first_elem + size * j); > + bool elem_is_null = !elem && size; > + > + if (is_null != elem_is_null) { > + max_elems = j - i; > + break; > + } > + } > } > > vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field, > - i, n_elems); > + i, max_elems); > > if (inner_field->flags & VMS_STRUCT) { > ret = vmstate_save_state(f, inner_field->vmsd, > diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py > index 4836920ddc..9138e91a11 100755 > --- a/scripts/analyze-migration.py > +++ b/scripts/analyze-migration.py > @@ -497,7 +497,14 @@ def read(self): > raise Exception("internal index of data field unmatched (%d/%d)" % (len(a), int(field['index']))) > a.append(field['data']) > else: > - self.data[field['name']] = field['data'] > + # There could be multiple entries for the same field > + # name, e.g. when a compressed array was broken in > + # more than one piece. > + if (field['name'] in self.data and > + type(self.data[field['name']]) == list): > + self.data[field['name']].append(field['data']) > + else: > + self.data[field['name']] = field['data'] Do we realy need these script changes? I thought VMSDFieldStruct always breaks array_len field into "index" based anyway? new_fields = [] for field in self.desc['struct']['fields']: if not 'array_len' in field: new_fields.append(field) continue array_len = field.pop('array_len') field['index'] = 0 new_fields.append(field) for i in range(1, array_len): c = field.copy() c['index'] = i new_fields.append(c) self.desc['struct']['fields'] = new_fields
Peter Xu <peterx@redhat.com> writes: > On Tue, Jan 07, 2025 at 04:50:24PM -0300, Fabiano Rosas wrote: >> Currently, if an array of pointers contains a NULL pointer, that >> pointer will be encoded as '0' in the stream. Since the JSON writer >> doesn't define a "pointer" type, that '0' will now be an uint64, which >> is different from the original type being pointed to, e.g. struct. >> >> That mixed-type array shouldn't be compressed, otherwise data is lost >> as the code currently makes the whole array have the type of the first >> element. >> >> While we could disable the array compression when a NULL pointer is >> found, the JSON part of the stream still makes part of downtime, so we >> should avoid writing unecessary bytes to it. >> >> Keep the array compression in place, but break the array into several >> type-contiguous pieces if NULL and non-NULL pointers are mixed. > > Could I request for a sample JSON dump for an example array in the commit > log? This whole solution looks working but is tricky. A sample could help > people understand (e.g. showing the same "name" being dumped multiple > times..). {"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css", "version": 1, "fields": [ ..., {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, {"name": "css", "type": "struct", "struct": { "vmsd_name": "s390_css_img", "version": 1, "fields": [{"name": "chpids", "array_len": 256, "type": "struct", "struct": {"vmsd_name": "s390_chp_info", "version": 1, "fields": [{"name": "in_use", "type": "uint8", "size": 1}, {"name": "type", "type": "uint8", "size": 1}, {"name": "is_virtual", "type": "uint8", "size": 1}]}, "size": 3}]}, "size": 768}, {"name": "css", "type": "uint8", "size": 1}, ... ]} > > Side note: I tried to dump a very basic VM's JSON out to disk, it scares me > on the size: > > $ ls -lhS JSON.out > -rw-r--r--. 1 peterx peterx 106K Jan 7 17:18 JSON.out > > That's a simplest VM with all default stuff, mostly nothing complex.. I may > really need to measure how the JSON debug strings affect migration function > or perf at some point.. > Agreed. >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de> >> --- >> migration/vmstate.c | 33 ++++++++++++++++++++++++++++++++- >> scripts/analyze-migration.py | 9 ++++++++- >> 2 files changed, 40 insertions(+), 2 deletions(-) >> >> diff --git a/migration/vmstate.c b/migration/vmstate.c >> index 52704c822c..a79ccf3875 100644 >> --- a/migration/vmstate.c >> +++ b/migration/vmstate.c >> @@ -425,15 +425,19 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, >> int size = vmstate_size(opaque, field); >> uint64_t old_offset, written_bytes; >> JSONWriter *vmdesc_loop = vmdesc; >> + bool is_prev_null = false; >> >> trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems); >> if (field->flags & VMS_POINTER) { >> first_elem = *(void **)first_elem; >> assert(first_elem || !n_elems || !size); >> } >> + >> for (i = 0; i < n_elems; i++) { >> void *curr_elem = first_elem + size * i; >> const VMStateField *inner_field; >> + bool is_null; >> + int max_elems = n_elems - i; >> >> old_offset = qemu_file_transferred(f); >> if (field->flags & VMS_ARRAY_OF_POINTER) { >> @@ -448,12 +452,39 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, >> * not follow. >> */ >> inner_field = vmsd_create_fake_nullptr_field(field); >> + is_null = true; >> } else { >> inner_field = field; >> + is_null = false; >> + } >> + >> + /* >> + * Due to the fake nullptr handling above, if there's mixed >> + * null/non-null data, it doesn't make sense to emit a >> + * compressed array representation spanning the entire array >> + * because the field types will be different (e.g. struct >> + * vs. uint64_t). Search ahead for the next null/non-null >> + * element and start a new compressed array if found. >> + */ >> + if (field->flags & VMS_ARRAY_OF_POINTER && >> + is_null != is_prev_null) { >> + >> + is_prev_null = is_null; >> + vmdesc_loop = vmdesc; >> + >> + for (int j = i + 1; j < n_elems; j++) { >> + void *elem = *(void **)(first_elem + size * j); >> + bool elem_is_null = !elem && size; >> + >> + if (is_null != elem_is_null) { >> + max_elems = j - i; >> + break; >> + } >> + } >> } >> >> vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field, >> - i, n_elems); >> + i, max_elems); >> >> if (inner_field->flags & VMS_STRUCT) { >> ret = vmstate_save_state(f, inner_field->vmsd, >> diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py >> index 4836920ddc..9138e91a11 100755 >> --- a/scripts/analyze-migration.py >> +++ b/scripts/analyze-migration.py >> @@ -497,7 +497,14 @@ def read(self): >> raise Exception("internal index of data field unmatched (%d/%d)" % (len(a), int(field['index']))) >> a.append(field['data']) There's actually a bug here, the code above does: if len(a) != int(field['index']): raise Exception() Which only works with this patch because the compressed array happens to come first. >> else: >> - self.data[field['name']] = field['data'] >> + # There could be multiple entries for the same field >> + # name, e.g. when a compressed array was broken in >> + # more than one piece. >> + if (field['name'] in self.data and >> + type(self.data[field['name']]) == list): >> + self.data[field['name']].append(field['data']) >> + else: >> + self.data[field['name']] = field['data'] > > Do we realy need these script changes? I thought VMSDFieldStruct always > breaks array_len field into "index" based anyway? > > new_fields = [] > for field in self.desc['struct']['fields']: > if not 'array_len' in field: > new_fields.append(field) > continue > array_len = field.pop('array_len') > field['index'] = 0 > new_fields.append(field) > for i in range(1, array_len): > c = field.copy() > c['index'] = i > new_fields.append(c) > > self.desc['struct']['fields'] = new_fields This code is about decompressing the array, it doesn't handle multiple entries with the same name. See the JSON I posted up there. This makes the single: {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, become multiple: {"name": "css", "index": 0, "type": "uint8", "size": 1}, {"name": "css", "index": 1, "type": "uint8", "size": 1}, ... {"name": "css", "index": 253, "type": "uint8", "size": 1},
On Wed, Jan 08, 2025 at 10:52:30AM -0300, Fabiano Rosas wrote: > Peter Xu <peterx@redhat.com> writes: > > > On Tue, Jan 07, 2025 at 04:50:24PM -0300, Fabiano Rosas wrote: > >> Currently, if an array of pointers contains a NULL pointer, that > >> pointer will be encoded as '0' in the stream. Since the JSON writer > >> doesn't define a "pointer" type, that '0' will now be an uint64, which > >> is different from the original type being pointed to, e.g. struct. > >> > >> That mixed-type array shouldn't be compressed, otherwise data is lost > >> as the code currently makes the whole array have the type of the first > >> element. > >> > >> While we could disable the array compression when a NULL pointer is > >> found, the JSON part of the stream still makes part of downtime, so we > >> should avoid writing unecessary bytes to it. > >> > >> Keep the array compression in place, but break the array into several > >> type-contiguous pieces if NULL and non-NULL pointers are mixed. > > > > Could I request for a sample JSON dump for an example array in the commit > > log? This whole solution looks working but is tricky. A sample could help > > people understand (e.g. showing the same "name" being dumped multiple > > times..). > > {"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css", > "version": 1, "fields": [ > ..., > {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, > {"name": "css", "type": "struct", "struct": { > "vmsd_name": "s390_css_img", "version": 1, "fields": [{"name": > "chpids", "array_len": 256, "type": "struct", "struct": {"vmsd_name": > "s390_chp_info", "version": 1, "fields": [{"name": "in_use", "type": > "uint8", "size": 1}, {"name": "type", "type": "uint8", "size": 1}, > {"name": "is_virtual", "type": "uint8", "size": 1}]}, "size": 3}]}, > "size": 768}, > {"name": "css", "type": "uint8", "size": 1}, > ... > ]} Yes something like this would work, thanks. We could even omit most of the struct details but only show the important ones: {"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css", "version": 1, "fields": [ ..., {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768}, {"name": "css", "type": "uint8", "size": 1}, ... ]} > > > > > Side note: I tried to dump a very basic VM's JSON out to disk, it scares me > > on the size: > > > > $ ls -lhS JSON.out > > -rw-r--r--. 1 peterx peterx 106K Jan 7 17:18 JSON.out > > > > That's a simplest VM with all default stuff, mostly nothing complex.. I may > > really need to measure how the JSON debug strings affect migration function > > or perf at some point.. > > > > Agreed. > > >> > >> Signed-off-by: Fabiano Rosas <farosas@suse.de> > >> --- > >> migration/vmstate.c | 33 ++++++++++++++++++++++++++++++++- > >> scripts/analyze-migration.py | 9 ++++++++- > >> 2 files changed, 40 insertions(+), 2 deletions(-) > >> > >> diff --git a/migration/vmstate.c b/migration/vmstate.c > >> index 52704c822c..a79ccf3875 100644 > >> --- a/migration/vmstate.c > >> +++ b/migration/vmstate.c > >> @@ -425,15 +425,19 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, > >> int size = vmstate_size(opaque, field); > >> uint64_t old_offset, written_bytes; > >> JSONWriter *vmdesc_loop = vmdesc; > >> + bool is_prev_null = false; > >> > >> trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems); > >> if (field->flags & VMS_POINTER) { > >> first_elem = *(void **)first_elem; > >> assert(first_elem || !n_elems || !size); > >> } > >> + > >> for (i = 0; i < n_elems; i++) { > >> void *curr_elem = first_elem + size * i; > >> const VMStateField *inner_field; > >> + bool is_null; > >> + int max_elems = n_elems - i; > >> > >> old_offset = qemu_file_transferred(f); > >> if (field->flags & VMS_ARRAY_OF_POINTER) { > >> @@ -448,12 +452,39 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, > >> * not follow. > >> */ > >> inner_field = vmsd_create_fake_nullptr_field(field); > >> + is_null = true; > >> } else { > >> inner_field = field; > >> + is_null = false; > >> + } > >> + > >> + /* > >> + * Due to the fake nullptr handling above, if there's mixed > >> + * null/non-null data, it doesn't make sense to emit a > >> + * compressed array representation spanning the entire array > >> + * because the field types will be different (e.g. struct > >> + * vs. uint64_t). Search ahead for the next null/non-null > >> + * element and start a new compressed array if found. > >> + */ > >> + if (field->flags & VMS_ARRAY_OF_POINTER && > >> + is_null != is_prev_null) { > >> + > >> + is_prev_null = is_null; > >> + vmdesc_loop = vmdesc; > >> + > >> + for (int j = i + 1; j < n_elems; j++) { > >> + void *elem = *(void **)(first_elem + size * j); > >> + bool elem_is_null = !elem && size; > >> + > >> + if (is_null != elem_is_null) { > >> + max_elems = j - i; > >> + break; > >> + } > >> + } > >> } > >> > >> vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field, > >> - i, n_elems); > >> + i, max_elems); > >> > >> if (inner_field->flags & VMS_STRUCT) { > >> ret = vmstate_save_state(f, inner_field->vmsd, > >> diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py > >> index 4836920ddc..9138e91a11 100755 > >> --- a/scripts/analyze-migration.py > >> +++ b/scripts/analyze-migration.py > >> @@ -497,7 +497,14 @@ def read(self): > >> raise Exception("internal index of data field unmatched (%d/%d)" % (len(a), int(field['index']))) > >> a.append(field['data']) > > There's actually a bug here, the code above does: > > if len(a) != int(field['index']): > raise Exception() > > Which only works with this patch because the compressed array happens to > come first. I think it will work no matter how it's ordered after your patch? IOW I'd hope it'll keep working if the 1st is a nullptr: {"name": "css", "type": "uint8", "size": 1}, {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768}, {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, Because IIUC the python script will parse each of the lines above into a VMSD field. > > >> else: > >> - self.data[field['name']] = field['data'] > >> + # There could be multiple entries for the same field > >> + # name, e.g. when a compressed array was broken in > >> + # more than one piece. > >> + if (field['name'] in self.data and > >> + type(self.data[field['name']]) == list): > >> + self.data[field['name']].append(field['data']) > >> + else: > >> + self.data[field['name']] = field['data'] > > > > Do we realy need these script changes? I thought VMSDFieldStruct always > > breaks array_len field into "index" based anyway? > > > > new_fields = [] > > for field in self.desc['struct']['fields']: > > if not 'array_len' in field: > > new_fields.append(field) > > continue > > array_len = field.pop('array_len') > > field['index'] = 0 > > new_fields.append(field) > > for i in range(1, array_len): > > c = field.copy() > > c['index'] = i > > new_fields.append(c) > > > > self.desc['struct']['fields'] = new_fields > > This code is about decompressing the array, it doesn't handle multiple > entries with the same name. See the JSON I posted up there. > > This makes the single: > > {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, > > become multiple: > > {"name": "css", "index": 0, "type": "uint8", "size": 1}, > {"name": "css", "index": 1, "type": "uint8", "size": 1}, > ... > {"name": "css", "index": 253, "type": "uint8", "size": 1}, Correct. I think that means for each of the break-down entries there'll be an "index" if it's an array. What you changed above is the case where "index" is not available, which is processing the non-array entry. Why does that need change? What happens if you run this without the python part you changed in this patch?
Peter Xu <peterx@redhat.com> writes: > On Wed, Jan 08, 2025 at 10:52:30AM -0300, Fabiano Rosas wrote: >> Peter Xu <peterx@redhat.com> writes: >> >> > On Tue, Jan 07, 2025 at 04:50:24PM -0300, Fabiano Rosas wrote: >> >> Currently, if an array of pointers contains a NULL pointer, that >> >> pointer will be encoded as '0' in the stream. Since the JSON writer >> >> doesn't define a "pointer" type, that '0' will now be an uint64, which >> >> is different from the original type being pointed to, e.g. struct. >> >> >> >> That mixed-type array shouldn't be compressed, otherwise data is lost >> >> as the code currently makes the whole array have the type of the first >> >> element. >> >> >> >> While we could disable the array compression when a NULL pointer is >> >> found, the JSON part of the stream still makes part of downtime, so we >> >> should avoid writing unecessary bytes to it. >> >> >> >> Keep the array compression in place, but break the array into several >> >> type-contiguous pieces if NULL and non-NULL pointers are mixed. >> > >> > Could I request for a sample JSON dump for an example array in the commit >> > log? This whole solution looks working but is tricky. A sample could help >> > people understand (e.g. showing the same "name" being dumped multiple >> > times..). >> >> {"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css", >> "version": 1, "fields": [ >> ..., >> {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, >> {"name": "css", "type": "struct", "struct": { >> "vmsd_name": "s390_css_img", "version": 1, "fields": [{"name": >> "chpids", "array_len": 256, "type": "struct", "struct": {"vmsd_name": >> "s390_chp_info", "version": 1, "fields": [{"name": "in_use", "type": >> "uint8", "size": 1}, {"name": "type", "type": "uint8", "size": 1}, >> {"name": "is_virtual", "type": "uint8", "size": 1}]}, "size": 3}]}, >> "size": 768}, >> {"name": "css", "type": "uint8", "size": 1}, >> ... >> ]} > > Yes something like this would work, thanks. We could even omit most of the > struct details but only show the important ones: > > {"name": "s390_css", "instance_id": 0, "vmsd_name": "s390_css", > "version": 1, "fields": [ > ..., > {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, > {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768}, > {"name": "css", "type": "uint8", "size": 1}, > ... > ]} > >> >> > >> > Side note: I tried to dump a very basic VM's JSON out to disk, it scares me >> > on the size: >> > >> > $ ls -lhS JSON.out >> > -rw-r--r--. 1 peterx peterx 106K Jan 7 17:18 JSON.out >> > >> > That's a simplest VM with all default stuff, mostly nothing complex.. I may >> > really need to measure how the JSON debug strings affect migration function >> > or perf at some point.. >> > >> >> Agreed. >> >> >> >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de> >> >> --- >> >> migration/vmstate.c | 33 ++++++++++++++++++++++++++++++++- >> >> scripts/analyze-migration.py | 9 ++++++++- >> >> 2 files changed, 40 insertions(+), 2 deletions(-) >> >> >> >> diff --git a/migration/vmstate.c b/migration/vmstate.c >> >> index 52704c822c..a79ccf3875 100644 >> >> --- a/migration/vmstate.c >> >> +++ b/migration/vmstate.c >> >> @@ -425,15 +425,19 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, >> >> int size = vmstate_size(opaque, field); >> >> uint64_t old_offset, written_bytes; >> >> JSONWriter *vmdesc_loop = vmdesc; >> >> + bool is_prev_null = false; >> >> >> >> trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems); >> >> if (field->flags & VMS_POINTER) { >> >> first_elem = *(void **)first_elem; >> >> assert(first_elem || !n_elems || !size); >> >> } >> >> + >> >> for (i = 0; i < n_elems; i++) { >> >> void *curr_elem = first_elem + size * i; >> >> const VMStateField *inner_field; >> >> + bool is_null; >> >> + int max_elems = n_elems - i; >> >> >> >> old_offset = qemu_file_transferred(f); >> >> if (field->flags & VMS_ARRAY_OF_POINTER) { >> >> @@ -448,12 +452,39 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, >> >> * not follow. >> >> */ >> >> inner_field = vmsd_create_fake_nullptr_field(field); >> >> + is_null = true; >> >> } else { >> >> inner_field = field; >> >> + is_null = false; >> >> + } >> >> + >> >> + /* >> >> + * Due to the fake nullptr handling above, if there's mixed >> >> + * null/non-null data, it doesn't make sense to emit a >> >> + * compressed array representation spanning the entire array >> >> + * because the field types will be different (e.g. struct >> >> + * vs. uint64_t). Search ahead for the next null/non-null >> >> + * element and start a new compressed array if found. >> >> + */ >> >> + if (field->flags & VMS_ARRAY_OF_POINTER && >> >> + is_null != is_prev_null) { >> >> + >> >> + is_prev_null = is_null; >> >> + vmdesc_loop = vmdesc; >> >> + >> >> + for (int j = i + 1; j < n_elems; j++) { >> >> + void *elem = *(void **)(first_elem + size * j); >> >> + bool elem_is_null = !elem && size; >> >> + >> >> + if (is_null != elem_is_null) { >> >> + max_elems = j - i; >> >> + break; >> >> + } >> >> + } >> >> } >> >> >> >> vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field, >> >> - i, n_elems); >> >> + i, max_elems); >> >> >> >> if (inner_field->flags & VMS_STRUCT) { >> >> ret = vmstate_save_state(f, inner_field->vmsd, >> >> diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py >> >> index 4836920ddc..9138e91a11 100755 >> >> --- a/scripts/analyze-migration.py >> >> +++ b/scripts/analyze-migration.py >> >> @@ -497,7 +497,14 @@ def read(self): >> >> raise Exception("internal index of data field unmatched (%d/%d)" % (len(a), int(field['index']))) >> >> a.append(field['data']) >> >> There's actually a bug here, the code above does: >> >> if len(a) != int(field['index']): >> raise Exception() >> >> Which only works with this patch because the compressed array happens to >> come first. > > I think it will work no matter how it's ordered after your patch? IOW I'd > hope it'll keep working if the 1st is a nullptr: > > {"name": "css", "type": "uint8", "size": 1}, > {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768}, > {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, > > Because IIUC the python script will parse each of the lines above into a > VMSD field. Yes, but all fields go into self.data of the VMSDFieldStruct, so self.data["css"] will increase beyond the size of the array. >> >> >> else: >> >> - self.data[field['name']] = field['data'] >> >> + # There could be multiple entries for the same field >> >> + # name, e.g. when a compressed array was broken in >> >> + # more than one piece. >> >> + if (field['name'] in self.data and >> >> + type(self.data[field['name']]) == list): >> >> + self.data[field['name']].append(field['data']) >> >> + else: >> >> + self.data[field['name']] = field['data'] >> > >> > Do we realy need these script changes? I thought VMSDFieldStruct always >> > breaks array_len field into "index" based anyway? >> > >> > new_fields = [] >> > for field in self.desc['struct']['fields']: >> > if not 'array_len' in field: >> > new_fields.append(field) >> > continue >> > array_len = field.pop('array_len') >> > field['index'] = 0 >> > new_fields.append(field) >> > for i in range(1, array_len): >> > c = field.copy() >> > c['index'] = i >> > new_fields.append(c) >> > >> > self.desc['struct']['fields'] = new_fields >> >> This code is about decompressing the array, it doesn't handle multiple >> entries with the same name. See the JSON I posted up there. >> >> This makes the single: >> >> {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, >> >> become multiple: >> >> {"name": "css", "index": 0, "type": "uint8", "size": 1}, >> {"name": "css", "index": 1, "type": "uint8", "size": 1}, >> ... >> {"name": "css", "index": 253, "type": "uint8", "size": 1}, > > Correct. > > I think that means for each of the break-down entries there'll be an > "index" if it's an array. What you changed above is the case where "index" > is not available, which is processing the non-array entry. Why does that > need change? It needs to append to, not overwrite the previous self.data[name] > What happens if you run this without the python part you > changed in this patch? The last nullptr overwrites everything else: "s390_css (14)": { "pending_crws": "00", "sei_pending": false, "do_crw_mchk": true, "crws_lost": false, "max_cssid": "0x00", "max_ssid": "0x00", "chnmon_active": false, "chnmon_area": "0x0000000000000000", --> "css": "nullptr", "default_cssid": "0xfe" },
On Wed, Jan 08, 2025 at 02:15:19PM -0300, Fabiano Rosas wrote: > >> >> else: > >> >> - self.data[field['name']] = field['data'] > >> >> + # There could be multiple entries for the same field > >> >> + # name, e.g. when a compressed array was broken in > >> >> + # more than one piece. > >> >> + if (field['name'] in self.data and > >> >> + type(self.data[field['name']]) == list): > >> >> + self.data[field['name']].append(field['data']) > >> >> + else: > >> >> + self.data[field['name']] = field['data'] [...] > The last nullptr overwrites everything else: > > "s390_css (14)": { > "pending_crws": "00", > "sei_pending": false, > "do_crw_mchk": true, > "crws_lost": false, > "max_cssid": "0x00", > "max_ssid": "0x00", > "chnmon_active": false, > "chnmon_area": "0x0000000000000000", > --> "css": "nullptr", > "default_cssid": "0xfe" > }, Oh I see what you meant.. Then I am guessing the current change may not always work, e.g. when the 1st entry only contains one element rather than an array, like: {"name": "css", "type": "uint8", "size": 1}, {"name": "css", "type": "struct", "struct": {"vmsd_name": "s390_css_img", ... }, "size": 768}, {"name": "css", "array_len": 254, "type": "uint8", "size": 1}, Here we may need something like: name = field['name'] if (name in self.data): if (type(self.data[name]) is not list): self.data[name] = [self.data[name]] self.data[name].append(field['data']) else: self.data[name] = field['data']
diff --git a/migration/vmstate.c b/migration/vmstate.c index 52704c822c..a79ccf3875 100644 --- a/migration/vmstate.c +++ b/migration/vmstate.c @@ -425,15 +425,19 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, int size = vmstate_size(opaque, field); uint64_t old_offset, written_bytes; JSONWriter *vmdesc_loop = vmdesc; + bool is_prev_null = false; trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems); if (field->flags & VMS_POINTER) { first_elem = *(void **)first_elem; assert(first_elem || !n_elems || !size); } + for (i = 0; i < n_elems; i++) { void *curr_elem = first_elem + size * i; const VMStateField *inner_field; + bool is_null; + int max_elems = n_elems - i; old_offset = qemu_file_transferred(f); if (field->flags & VMS_ARRAY_OF_POINTER) { @@ -448,12 +452,39 @@ int vmstate_save_state_v(QEMUFile *f, const VMStateDescription *vmsd, * not follow. */ inner_field = vmsd_create_fake_nullptr_field(field); + is_null = true; } else { inner_field = field; + is_null = false; + } + + /* + * Due to the fake nullptr handling above, if there's mixed + * null/non-null data, it doesn't make sense to emit a + * compressed array representation spanning the entire array + * because the field types will be different (e.g. struct + * vs. uint64_t). Search ahead for the next null/non-null + * element and start a new compressed array if found. + */ + if (field->flags & VMS_ARRAY_OF_POINTER && + is_null != is_prev_null) { + + is_prev_null = is_null; + vmdesc_loop = vmdesc; + + for (int j = i + 1; j < n_elems; j++) { + void *elem = *(void **)(first_elem + size * j); + bool elem_is_null = !elem && size; + + if (is_null != elem_is_null) { + max_elems = j - i; + break; + } + } } vmsd_desc_field_start(vmsd, vmdesc_loop, inner_field, - i, n_elems); + i, max_elems); if (inner_field->flags & VMS_STRUCT) { ret = vmstate_save_state(f, inner_field->vmsd, diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py index 4836920ddc..9138e91a11 100755 --- a/scripts/analyze-migration.py +++ b/scripts/analyze-migration.py @@ -497,7 +497,14 @@ def read(self): raise Exception("internal index of data field unmatched (%d/%d)" % (len(a), int(field['index']))) a.append(field['data']) else: - self.data[field['name']] = field['data'] + # There could be multiple entries for the same field + # name, e.g. when a compressed array was broken in + # more than one piece. + if (field['name'] in self.data and + type(self.data[field['name']]) == list): + self.data[field['name']].append(field['data']) + else: + self.data[field['name']] = field['data'] if 'subsections' in self.desc['struct']: for subsection in self.desc['struct']['subsections']:
Currently, if an array of pointers contains a NULL pointer, that pointer will be encoded as '0' in the stream. Since the JSON writer doesn't define a "pointer" type, that '0' will now be an uint64, which is different from the original type being pointed to, e.g. struct. That mixed-type array shouldn't be compressed, otherwise data is lost as the code currently makes the whole array have the type of the first element. While we could disable the array compression when a NULL pointer is found, the JSON part of the stream still makes part of downtime, so we should avoid writing unecessary bytes to it. Keep the array compression in place, but break the array into several type-contiguous pieces if NULL and non-NULL pointers are mixed. Signed-off-by: Fabiano Rosas <farosas@suse.de> --- migration/vmstate.c | 33 ++++++++++++++++++++++++++++++++- scripts/analyze-migration.py | 9 ++++++++- 2 files changed, 40 insertions(+), 2 deletions(-)