diff mbox series

[5/6] migration: Rephrase message on failure to save / load Xen device state

Message ID 20240513141703.549874-6-armbru@redhat.com (mailing list archive)
State New, archived
Headers show
Series error: Eliminate QERR_IO_ERROR | expand

Commit Message

Markus Armbruster May 13, 2024, 2:17 p.m. UTC
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
this principle: they call qemu_save_device_state() and
qemu_loadvm_state(), which call error_report_err().

I wish I could clean this up now, but migration's error reporting is
too complicated (confused?) for me to mess with it.

Instead, I'm merely improving the error reported by
qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
QMP core from

    An IO error has occurred

to
    saving Xen device state failed

and

    loading Xen device state failed

respectively.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 migration/savevm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Philippe Mathieu-Daudé May 13, 2024, 2:24 p.m. UTC | #1
On 13/5/24 16:17, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job.  When the caller does, the error is reported twice.  When it
> doesn't (because it recovered from the error), there is no error to
> report, i.e. the report is bogus.
> 
> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> this principle: they call qemu_save_device_state() and
> qemu_loadvm_state(), which call error_report_err().
> 
> I wish I could clean this up now, but migration's error reporting is
> too complicated (confused?) for me to mess with it.
> 
> Instead, I'm merely improving the error reported by
> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
> QMP core from
> 
>      An IO error has occurred
> 
> to
>      saving Xen device state failed
> 
> and
> 
>      loading Xen device state failed
> 
> respectively.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>   migration/savevm.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Fabiano Rosas May 13, 2024, 6:07 p.m. UTC | #2
Markus Armbruster <armbru@redhat.com> writes:

> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job.  When the caller does, the error is reported twice.  When it
> doesn't (because it recovered from the error), there is no error to
> report, i.e. the report is bogus.
>
> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> this principle: they call qemu_save_device_state() and
> qemu_loadvm_state(), which call error_report_err().
>
> I wish I could clean this up now, but migration's error reporting is
> too complicated (confused?) for me to mess with it.
>
> Instead, I'm merely improving the error reported by
> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
> QMP core from
>
>     An IO error has occurred
>
> to
>     saving Xen device state failed
>
> and
>
>     loading Xen device state failed
>
> respectively.
>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Acked-by: Fabiano Rosas <farosas@suse.de>
Peter Xu May 23, 2024, 7:43 p.m. UTC | #3
On Mon, May 13, 2024 at 04:17:02PM +0200, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job.  When the caller does, the error is reported twice.  When it
> doesn't (because it recovered from the error), there is no error to
> report, i.e. the report is bogus.
> 
> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> this principle: they call qemu_save_device_state() and
> qemu_loadvm_state(), which call error_report_err().
> 
> I wish I could clean this up now, but migration's error reporting is
> too complicated (confused?) for me to mess with it.

:-(

> 
> Instead, I'm merely improving the error reported by
> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
> QMP core from
> 
>     An IO error has occurred
> 
> to
>     saving Xen device state failed
> 
> and
> 
>     loading Xen device state failed
> 
> respectively.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Acked-by: Peter Xu <peterx@redhat.com>
Markus Armbruster May 27, 2024, 10:53 a.m. UTC | #4
Peter Xu <peterx@redhat.com> writes:

> On Mon, May 13, 2024 at 04:17:02PM +0200, Markus Armbruster wrote:
>> Functions that use an Error **errp parameter to return errors should
>> not also report them to the user, because reporting is the caller's
>> job.  When the caller does, the error is reported twice.  When it
>> doesn't (because it recovered from the error), there is no error to
>> report, i.e. the report is bogus.
>> 
>> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
>> this principle: they call qemu_save_device_state() and
>> qemu_loadvm_state(), which call error_report_err().
>> 
>> I wish I could clean this up now, but migration's error reporting is
>> too complicated (confused?) for me to mess with it.
>
> :-(

If I understood how it's *supposed* to work, I might have a chance...

I can see a mixture of reporting errors directly (with error_report() &
friends), passing them to callers (via Error **errp), and storing them
in / retrieving them from MigrationState member @error.  This can't be
right.

I think a necessary first step towards getting it right is a shared
understanding how errors are to be handled in migration code.  This
includes how error data should flow from error source to error sink, and
what the possible sinks are.

>> Instead, I'm merely improving the error reported by
>> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
>> QMP core from
>> 
>>     An IO error has occurred
>> 
>> to
>>     saving Xen device state failed
>> 
>> and
>> 
>>     loading Xen device state failed
>> 
>> respectively.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
> Acked-by: Peter Xu <peterx@redhat.com>

Thanks!
Peter Xu May 27, 2024, 2:35 p.m. UTC | #5
On Mon, May 27, 2024 at 12:53:22PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, May 13, 2024 at 04:17:02PM +0200, Markus Armbruster wrote:
> >> Functions that use an Error **errp parameter to return errors should
> >> not also report them to the user, because reporting is the caller's
> >> job.  When the caller does, the error is reported twice.  When it
> >> doesn't (because it recovered from the error), there is no error to
> >> report, i.e. the report is bogus.
> >> 
> >> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> >> this principle: they call qemu_save_device_state() and
> >> qemu_loadvm_state(), which call error_report_err().
> >> 
> >> I wish I could clean this up now, but migration's error reporting is
> >> too complicated (confused?) for me to mess with it.
> >
> > :-(
> 
> If I understood how it's *supposed* to work, I might have a chance...
> 
> I can see a mixture of reporting errors directly (with error_report() &
> friends), passing them to callers (via Error **errp), and storing them
> in / retrieving them from MigrationState member @error.  This can't be
> right.
> 
> I think a necessary first step towards getting it right is a shared
> understanding how errors are to be handled in migration code.  This
> includes how error data should flow from error source to error sink, and
> what the possible sinks are.

True.  I think the sink should always be MigrationState.error so far.

There's also the other complexity on detecting errors using either
qemu_file_get_error() or migrate_get_error().. the error handling in
migration is indeed prone to a cleanup.

I've added a cleanup entry for migration todo page:

https://wiki.qemu.org/ToDo/LiveMigration#Migration_error_detection_and_reporting

Thanks,
diff mbox series

Patch

diff --git a/migration/savevm.c b/migration/savevm.c
index 4509482ec4..a4a856982a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -45,7 +45,6 @@ 
 #include "qapi/qapi-commands-migration.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-builtin-visit.h"
-#include "qapi/qmp/qerror.h"
 #include "qemu/error-report.h"
 #include "sysemu/cpus.h"
 #include "exec/memory.h"
@@ -3208,7 +3207,7 @@  void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
     object_unref(OBJECT(ioc));
     ret = qemu_save_device_state(f);
     if (ret < 0 || qemu_fclose(f) < 0) {
-        error_setg(errp, QERR_IO_ERROR);
+        error_setg(errp, "saving Xen device state failed");
     } else {
         /* libxl calls the QMP command "stop" before calling
          * "xen-save-devices-state" and in case of migration failure, libxl
@@ -3257,7 +3256,7 @@  void qmp_xen_load_devices_state(const char *filename, Error **errp)
     ret = qemu_loadvm_state(f);
     qemu_fclose(f);
     if (ret < 0) {
-        error_setg(errp, QERR_IO_ERROR);
+        error_setg(errp, "loading Xen device state failed");
     }
     migration_incoming_state_destroy();
 }