Message ID | f307829a5e64121b5cb8ad1aefff09f41cac9699.1567070002.git.mprivozn@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | qapi: Reintroduce CommandDisabled error class | expand |
Michal Privoznik <mprivozn@redhat.com> writes: > If there was a disabled command, then qemu-ga used to report > CommandDisabled error class (among with human readable > description). This changed in v1.2.0-rc0~28^2~16 in favor of > GenericError class. Really? I believe it was slightly earlier in the same series: 93b91c59db qemu-ga: switch to the new error format on the wire de253f1491 qmp: switch to the new error format on the wire The commit you mention (df1e608a01e) is merely follow-up simplification. > While the change might work for other > classes, this one should not have been dropped because it helps > callers distinguish the root cause of the error. > > A bit of background: up until very recently libvirt used qemu-ga > in all or nothing way. It didn't care why a qemu-ga command > failed. But very recently a new API was introduced which > implements 'best effort' approach (in some cases) and thus > libvirt must differentiate between: {CommandNotFound, > CommandDisabled} and some generic error. While the former classes > mean the API can issue some other commands the latter raises a > red flag causing the API to fail. Why do you need to distinguish CommandNotFound from CommandDisabled? > This reverts df1e608a01 partially. > > Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
On 8/29/19 2:10 PM, Markus Armbruster wrote: > Michal Privoznik <mprivozn@redhat.com> writes: > >> If there was a disabled command, then qemu-ga used to report >> CommandDisabled error class (among with human readable >> description). This changed in v1.2.0-rc0~28^2~16 in favor of >> GenericError class. > > Really? I believe it was slightly earlier in the same series: > > 93b91c59db qemu-ga: switch to the new error format on the wire > de253f1491 qmp: switch to the new error format on the wire Ah, you're right. It's the first commit that you reference. > > The commit you mention (df1e608a01e) is merely follow-up simplification. > >> While the change might work for other >> classes, this one should not have been dropped because it helps >> callers distinguish the root cause of the error. >> >> A bit of background: up until very recently libvirt used qemu-ga >> in all or nothing way. It didn't care why a qemu-ga command >> failed. But very recently a new API was introduced which >> implements 'best effort' approach (in some cases) and thus >> libvirt must differentiate between: {CommandNotFound, >> CommandDisabled} and some generic error. While the former classes >> mean the API can issue some other commands the latter raises a >> red flag causing the API to fail. > > Why do you need to distinguish CommandNotFound from CommandDisabled? I don't. That's why I've put them both in curly braces. Perhaps this says its better: switch (klass) { case CommandNotFound: case CommandDisabled: /* okay */ break; default: /* bad, error out */ break; } > >> This reverts df1e608a01 partially. >> >> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Michal
On 8/29/19 8:04 AM, Michal Privoznik wrote: >>> A bit of background: up until very recently libvirt used qemu-ga >>> in all or nothing way. It didn't care why a qemu-ga command >>> failed. But very recently a new API was introduced which >>> implements 'best effort' approach (in some cases) and thus >>> libvirt must differentiate between: {CommandNotFound, >>> CommandDisabled} and some generic error. While the former classes >>> mean the API can issue some other commands the latter raises a >>> red flag causing the API to fail. >> >> Why do you need to distinguish CommandNotFound from CommandDisabled? > > I don't. That's why I've put them both in curly braces. Perhaps this > says its better: > > switch (klass) { > case CommandNotFound: > case CommandDisabled: > /* okay */ > break; > So the obvious counter-question - why not use class CommandNotFound for a command that was disabled, rather than readding another class that has no distinctive purpose?
On 8/29/19 3:12 PM, Eric Blake wrote: > On 8/29/19 8:04 AM, Michal Privoznik wrote: > >>>> A bit of background: up until very recently libvirt used qemu-ga >>>> in all or nothing way. It didn't care why a qemu-ga command >>>> failed. But very recently a new API was introduced which >>>> implements 'best effort' approach (in some cases) and thus >>>> libvirt must differentiate between: {CommandNotFound, >>>> CommandDisabled} and some generic error. While the former classes >>>> mean the API can issue some other commands the latter raises a >>>> red flag causing the API to fail. >>> >>> Why do you need to distinguish CommandNotFound from CommandDisabled? >> >> I don't. That's why I've put them both in curly braces. Perhaps this >> says its better: >> >> switch (klass) { >> case CommandNotFound: >> case CommandDisabled: >> /* okay */ >> break; >> > > So the obvious counter-question - why not use class CommandNotFound for > a command that was disabled, rather than readding another class that has > no distinctive purpose? > > Because disabling a command is not the same as nonexistent command. While a command can be disabled by user/sysadmin, they are disabled at runtime by qemu-ga itself for a short period of time (e.g. on FS freeze some commands are disabled - typically those which require write disk access). And I guess reporting CommandNotFound for a command that does exist only is disabled temporarily doesn't reflect the reality, does it? On the other hand, CommandNotFound would fix the issue for libvirt, so if you don't want to invent a new error class, then that's the way to go. Michal
Michal Privoznik <mprivozn@redhat.com> writes: > On 8/29/19 3:12 PM, Eric Blake wrote: >> On 8/29/19 8:04 AM, Michal Privoznik wrote: >> >>>>> A bit of background: up until very recently libvirt used qemu-ga >>>>> in all or nothing way. It didn't care why a qemu-ga command >>>>> failed. But very recently a new API was introduced which >>>>> implements 'best effort' approach (in some cases) and thus >>>>> libvirt must differentiate between: {CommandNotFound, >>>>> CommandDisabled} and some generic error. While the former classes >>>>> mean the API can issue some other commands the latter raises a >>>>> red flag causing the API to fail. >>>> >>>> Why do you need to distinguish CommandNotFound from CommandDisabled? >>> >>> I don't. That's why I've put them both in curly braces. Perhaps this >>> says its better: >>> >>> switch (klass) { >>> case CommandNotFound: >>> case CommandDisabled: >>> /* okay */ >>> break; >>> >> >> So the obvious counter-question - why not use class CommandNotFound for >> a command that was disabled, rather than readding another class that has >> no distinctive purpose? >> >> > > Because disabling a command is not the same as nonexistent > command. While a command can be disabled by user/sysadmin, they are > disabled at runtime by qemu-ga itself for a short period of time > (e.g. on FS freeze some commands are disabled - typically those which > require write disk access). And I guess reporting CommandNotFound for > a command that does exist only is disabled temporarily doesn't reflect > the reality, does it? > > On the other hand, CommandNotFound would fix the issue for libvirt, so > if you don't want to invent a new error class, then that's the way to > go. I'm fine with changing the error to CommandNotFound. I'm reluctant to add back CommandDisabled. I doubt it's necessary. To arrive at an informed opinion, I had to figure out how this command disablement stuff works. I can just as well send it out, so here goes. Let's review our command disable feature. Commands are enabled on registration, see qmp_register_command(). To disable, call qmp_disable_command(). Only qga/main.c does, in two places: * ga_disable_non_whitelisted(): disable all commands except for ga_freeze_whitelist[], which is documented as /* commands that are safe to issue while filesystems are frozen */ * initialize_agent(): disable blacklisted commands. I figure these are the ones blacklisted with -b, plus commands blacklisted due to build configuration. The latter feels inappropriate; we should use QAPI schema conditionals to compile them out instead (QAPI conditionals didn't exist when the blacklisting code was written). Disabled commands can be re-enabled with qmp_enable_command(). Only qga/main.c does, in ga_enable_non_blacklisted(). I figure it re-enables the commands ga_disable_non_whitelisted() disables. Gets called when guest-fsfreeze-freeze freezes nothing[1], and when guest-fsfreeze-thaw succeeds[2]. Command dispatch fails when the command is disabled, in do_qmp_dispatch(). The proposed patch changes the error reply. QGA's guest-info shows whether a command is disabled (GuestAgentCommandInfo member @enabled, set in qmp_command_info()). QMP's query-commands skips disabled commands, in query_commands_cb(). Dead, as nothing ever disables QMP commands. Skipping feels like a bad idea anyway. Analysis: There are three kinds of disabled commands: compile-time (should be compiled out instead), permanently blacklisted with -b, temporarily disabled while filesystems are frozen. There are two states: thawed (first two kinds disabled) and frozen (all three kinds disabled). Command guest-fsfreeze-freeze[3] goes to state frozen or else fails. Command guest-fsfreeze-thaw goes to state thawed or else fails. guest-fsfreeze-status reports the state. Note that the transition to frozen (and thus the temporary command disablement) is under the control of the QGA client. There is no TOCTTOU between guest-info telling you which commands are disabled and executing the next command. My point is: the client can figure out whether a command is disabled before executing it. Of course, that doesn't mean we should make it figure it out. [1] POSIX only, WTF? [2] Except for execute_fsfreeze_hook(), which can still fail the command on POSIX, WTF? [3] guest-fsfreeze-freeze's doc comment notes "The frozen state is limited for up to 10 seconds by VSS." Sounds like some spontaneous transition back to thawed. If this is actually true, GAState member @frozen is not updated to reflect the spontaneous thaw. WTF?
On 8/30/19 1:52 PM, Markus Armbruster wrote: > Michal Privoznik <mprivozn@redhat.com> writes: > >> On 8/29/19 3:12 PM, Eric Blake wrote: >>> On 8/29/19 8:04 AM, Michal Privoznik wrote: >>> >>>>>> A bit of background: up until very recently libvirt used qemu-ga >>>>>> in all or nothing way. It didn't care why a qemu-ga command >>>>>> failed. But very recently a new API was introduced which >>>>>> implements 'best effort' approach (in some cases) and thus >>>>>> libvirt must differentiate between: {CommandNotFound, >>>>>> CommandDisabled} and some generic error. While the former classes >>>>>> mean the API can issue some other commands the latter raises a >>>>>> red flag causing the API to fail. >>>>> >>>>> Why do you need to distinguish CommandNotFound from CommandDisabled? >>>> >>>> I don't. That's why I've put them both in curly braces. Perhaps this >>>> says its better: >>>> >>>> switch (klass) { >>>> case CommandNotFound: >>>> case CommandDisabled: >>>> /* okay */ >>>> break; >>>> >>> >>> So the obvious counter-question - why not use class CommandNotFound for >>> a command that was disabled, rather than readding another class that has >>> no distinctive purpose? >>> >>> >> >> Because disabling a command is not the same as nonexistent >> command. While a command can be disabled by user/sysadmin, they are >> disabled at runtime by qemu-ga itself for a short period of time >> (e.g. on FS freeze some commands are disabled - typically those which >> require write disk access). And I guess reporting CommandNotFound for >> a command that does exist only is disabled temporarily doesn't reflect >> the reality, does it? >> >> On the other hand, CommandNotFound would fix the issue for libvirt, so >> if you don't want to invent a new error class, then that's the way to >> go. > > I'm fine with changing the error to CommandNotFound. > > I'm reluctant to add back CommandDisabled. I doubt it's necessary. > > To arrive at an informed opinion, I had to figure out how this command > disablement stuff works. I can just as well send it out, so here goes. > > Let's review our command disable feature. > > Commands are enabled on registration, see qmp_register_command(). > > To disable, call qmp_disable_command(). Only qga/main.c does, in two > places: > > * ga_disable_non_whitelisted(): disable all commands except for > ga_freeze_whitelist[], which is documented as /* commands that are > safe to issue while filesystems are frozen */ > > * initialize_agent(): disable blacklisted commands. I figure these are > the ones blacklisted with -b, plus commands blacklisted due to build > configuration. The latter feels inappropriate; we should use QAPI > schema conditionals to compile them out instead (QAPI conditionals > didn't exist when the blacklisting code was written). > > Disabled commands can be re-enabled with qmp_enable_command(). Only > qga/main.c does, in ga_enable_non_blacklisted(). I figure it re-enables > the commands ga_disable_non_whitelisted() disables. Gets called when > guest-fsfreeze-freeze freezes nothing[1], and when guest-fsfreeze-thaw > succeeds[2]. > > Command dispatch fails when the command is disabled, in > do_qmp_dispatch(). The proposed patch changes the error reply. > > QGA's guest-info shows whether a command is disabled > (GuestAgentCommandInfo member @enabled, set in qmp_command_info()). > > QMP's query-commands skips disabled commands, in query_commands_cb(). > Dead, as nothing ever disables QMP commands. Skipping feels like a bad > idea anyway. > > Analysis: > > There are three kinds of disabled commands: compile-time (should be > compiled out instead), permanently blacklisted with -b, temporarily > disabled while filesystems are frozen. > > There are two states: thawed (first two kinds disabled) and frozen (all > three kinds disabled). > > Command guest-fsfreeze-freeze[3] goes to state frozen or else fails. > > Command guest-fsfreeze-thaw goes to state thawed or else fails. > > guest-fsfreeze-status reports the state. > > Note that the transition to frozen (and thus the temporary command > disablement) is under the control of the QGA client. There is no > TOCTTOU between guest-info telling you which commands are disabled and > executing the next command. My point is: the client can figure out > whether a command is disabled before executing it. Alright then, I'll respin with CommandNotFound. Both work for libvirt. Michal
diff --git a/include/qapi/error.h b/include/qapi/error.h index 3f95141a01..7116b86a92 100644 --- a/include/qapi/error.h +++ b/include/qapi/error.h @@ -129,6 +129,7 @@ typedef enum ErrorClass { ERROR_CLASS_GENERIC_ERROR = QAPI_ERROR_CLASS_GENERICERROR, ERROR_CLASS_COMMAND_NOT_FOUND = QAPI_ERROR_CLASS_COMMANDNOTFOUND, + ERROR_CLASS_COMMAND_DISABLED = QAPI_ERROR_CLASS_COMMANDDISABLED, ERROR_CLASS_DEVICE_NOT_ACTIVE = QAPI_ERROR_CLASS_DEVICENOTACTIVE, ERROR_CLASS_DEVICE_NOT_FOUND = QAPI_ERROR_CLASS_DEVICENOTFOUND, ERROR_CLASS_KVM_MISSING_CAP = QAPI_ERROR_CLASS_KVMMISSINGCAP, diff --git a/qapi/error.json b/qapi/error.json index 3fad08f506..334d481399 100644 --- a/qapi/error.json +++ b/qapi/error.json @@ -14,6 +14,8 @@ # # @CommandNotFound: the requested command has not been found # +# @CommandDisabled: the requested command has been disabled +# # @DeviceNotActive: a device has failed to be become active # # @DeviceNotFound: the requested device has not been found @@ -25,5 +27,5 @@ ## { 'enum': 'QapiErrorClass', # Keep this in sync with ErrorClass in error.h - 'data': [ 'GenericError', 'CommandNotFound', + 'data': [ 'GenericError', 'CommandNotFound', 'CommandDisabled', 'DeviceNotActive', 'DeviceNotFound', 'KVMMissingCap' ] } diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c index 3037d353a4..913b3363cb 100644 --- a/qapi/qmp-dispatch.c +++ b/qapi/qmp-dispatch.c @@ -104,8 +104,9 @@ static QObject *do_qmp_dispatch(QmpCommandList *cmds, QObject *request, return NULL; } if (!cmd->enabled) { - error_setg(errp, "The command %s has been disabled for this instance", - command); + error_set(errp, ERROR_CLASS_COMMAND_DISABLED, + "The command %s has been disabled for this instance", + command); return NULL; } if (oob && !(cmd->options & QCO_ALLOW_OOB)) {
If there was a disabled command, then qemu-ga used to report CommandDisabled error class (among with human readable description). This changed in v1.2.0-rc0~28^2~16 in favor of GenericError class. While the change might work for other classes, this one should not have been dropped because it helps callers distinguish the root cause of the error. A bit of background: up until very recently libvirt used qemu-ga in all or nothing way. It didn't care why a qemu-ga command failed. But very recently a new API was introduced which implements 'best effort' approach (in some cases) and thus libvirt must differentiate between: {CommandNotFound, CommandDisabled} and some generic error. While the former classes mean the API can issue some other commands the latter raises a red flag causing the API to fail. This reverts df1e608a01 partially. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- include/qapi/error.h | 1 + qapi/error.json | 4 +++- qapi/qmp-dispatch.c | 5 +++-- 3 files changed, 7 insertions(+), 3 deletions(-)