machine: do not crash if default RAM backend name has been stollen

Message ID 20230522131717.3780533-1-imammedo@redhat.com (mailing list archive)
State New, archived

Commit Message

Igor Mammedov May 22, 2023, 1:17 p.m. UTC
QEMU aborts when the default RAM backend should be used (i.e. no
explicit '-machine memory-backend=' is specified) but the user
has created an object whose 'id' equals the default RAM backend
name used by the board.

 $QEMU -machine pc \
       -object memory-backend-ram,id=pc.ram,size=4294967296

 Actual results:
 QEMU 7.2.0 monitor - type 'help' for more information
 (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
 qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
 Aborted (core dumped)

Instead of aborting, check for the conflicting 'id' and exit with
an error that suggests how to remedy the issue.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
CC: thuth@redhat.com
---
 hw/core/machine.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Philippe Mathieu-Daudé May 22, 2023, 1:33 p.m. UTC | #1
On 22/5/23 15:17, Igor Mammedov wrote:
> QEMU aborts when default RAM backend should be used (i.e. no
> explicit '-machine memory-backend=' specified) but user
> has created an object which 'id' equals to default RAM backend
> name used by board.
> 
>   $QEMU -machine pc \
>         -object memory-backend-ram,id=pc.ram,size=4294967296
> 
>   Actual results:
>   QEMU 7.2.0 monitor - type 'help' for more information
>   (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
>   qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
>   Aborted (core dumped)
> 
> Instead of abort, check for the conflicting 'id' and exit with
> an error, suggesting how to remedy the issue.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> CC: thuth@redhat.com
> ---
>   hw/core/machine.c | 8 ++++++++
>   1 file changed, 8 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Shaoqin Huang May 23, 2023, 9:10 a.m. UTC | #2
With the patch, QEMU exits cleanly with an error instead of aborting.

On 5/22/23 21:17, Igor Mammedov wrote:
> QEMU aborts when default RAM backend should be used (i.e. no
> explicit '-machine memory-backend=' specified) but user
> has created an object which 'id' equals to default RAM backend
> name used by board.
> 
>   $QEMU -machine pc \
>         -object memory-backend-ram,id=pc.ram,size=4294967296
> 
>   Actual results:
>   QEMU 7.2.0 monitor - type 'help' for more information
>   (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
>   qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
>   Aborted (core dumped)
> 
> Instead of abort, check for the conflicting 'id' and exit with
> an error, suggesting how to remedy the issue.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> CC: thuth@redhat.com
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
> ---
>   hw/core/machine.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 07f763eb2e..1000406211 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -1338,6 +1338,14 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
>           }
>       } else if (machine_class->default_ram_id && machine->ram_size &&
>                  numa_uses_legacy_mem()) {
> +        if (object_property_find(object_get_objects_root(),
> +                                 machine_class->default_ram_id)) {
> +            error_setg(errp, "object name '%s' is reserved for the default"
> +                " RAM backend, it can't be used for any other purposes."
> +                " Change the object's 'id' to something else",
> +                machine_class->default_ram_id);
> +            return;
> +        }
>           if (!create_default_memdev(current_machine, mem_path, errp)) {
>               return;
>           }
Markus Armbruster May 23, 2023, 12:31 p.m. UTC | #3
Igor Mammedov <imammedo@redhat.com> writes:

> QEMU aborts when default RAM backend should be used (i.e. no
> explicit '-machine memory-backend=' specified) but user
> has created an object which 'id' equals to default RAM backend
> name used by board.
>
>  $QEMU -machine pc \
>        -object memory-backend-ram,id=pc.ram,size=4294967296
>
>  Actual results:
>  QEMU 7.2.0 monitor - type 'help' for more information
>  (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
>  qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
>  Aborted (core dumped)
>
> Instead of abort, check for the conflicting 'id' and exit with
> an error, suggesting how to remedy the issue.

This is an instance of an (unfortunately common) anti-pattern.

The point of an ID is to *identify*.  To do that, IDs of the same kind
must be unique.  "Of the same kind" because we let different kinds of
objects have the same ID[*].

IDs are arbitrary strings.  The user may pick any ID, as long as it's
unique.  Unique not only among the user's IDs, but the system's, too.

Every time we add code that picks an ID, we break backward
compatibility: user configurations that use this ID no longer work.
Thus, system-picked IDs are part of the external interface.

We don't treat them as such.  They are pretty much undocumented, and
when we add new ones, we break the external interface silently.

How exactly things go wrong on a clash is a detail from an interface
design point of view.  This patch changes one instance from "crash" to
"fatal error".  No objections, just pointing out we're playing
whack-a-mole there.

The fundamental mistake we made was not reserving IDs for the system's
own use.

The excuse I heard back then was that IDs are for the user, and the
system isn't supposed to pick any.  Well, it does.

To stop creating more moles, we need to reserve IDs for the system's
use, and let the system pick only reserved IDs going forward.

There would be two kinds of reserved IDs: 1. an easily documented,
easily checked ID pattern, e.g. "starts with <prefix>", to be used by
the system going forward, and 2. the messy zoo of system IDs we have
accumulated so far.

Thoughts?

[...]


[*] Questionable idea if you ask me, but tangential to the point I'm
trying to make in this memo.
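
A minimal sketch of the first kind of reserved ID ("starts with <prefix>"),
written as standalone C rather than QEMU code. The prefix "#" and both
function names are assumptions made for illustration; the proposal above
does not fix a particular prefix.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical reserved prefix, chosen only for this example. */
#define SYSTEM_ID_PREFIX "#"

/* Return true if the ID lies in the name space reserved for the system. */
bool id_is_reserved(const char *id)
{
    assert(id != NULL);
    return strncmp(id, SYSTEM_ID_PREFIX, strlen(SYSTEM_ID_PREFIX)) == 0;
}

/* Return true if a user may pick this ID: non-empty and not reserved. */
bool user_id_is_allowed(const char *id)
{
    assert(id != NULL);
    return id[0] != '\0' && !id_is_reserved(id);
}
```

With a rule of this shape the check is a single prefix comparison, which is
what makes it easy to document and easy to enforce at every place where a
user-supplied ID is accepted.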
Igor Mammedov May 23, 2023, 12:56 p.m. UTC | #4
On Tue, 23 May 2023 14:31:30 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > QEMU aborts when default RAM backend should be used (i.e. no
> > explicit '-machine memory-backend=' specified) but user
> > has created an object which 'id' equals to default RAM backend
> > name used by board.
> >
> >  $QEMU -machine pc \
> >        -object memory-backend-ram,id=pc.ram,size=4294967296
> >
> >  Actual results:
> >  QEMU 7.2.0 monitor - type 'help' for more information
> >  (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
> >  qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
> >  Aborted (core dumped)
> >
> > Instead of abort, check for the conflicting 'id' and exit with
> > an error, suggesting how to remedy the issue.  
> 
> This is an instance of an (unfortunately common) anti-pattern.
> 
> The point of an ID is to *identify*.  To do that, IDs of the same kind
> must be unique.  "Of the same kind" because we let different kinds of
> objects have the same ID[*].
> 
> IDs are arbitrary strings.  The user may pick any ID, as long as it's
> unique.  Unique not only among the user's IDs, but the system's, too.
> 
> Every time we add code that picks an ID, we break backward
> compatibility: user configurations that use this ID no longer work.
> Thus, system-picked IDs are part of the external interface.

In this case, the ID is there to keep backward compatibility
(so migration won't fail), and it affects only the default (legacy**)
path, where the user doesn't provide a memory backend explicitly
(an explicit backend could be named anything that doesn't collide with
other objects).

> We don't treat them as such.  They are pretty much undocumented, and
> when we add new ones, we break the external interface silently.

This ID in particular is introspectable (it is part of the qmp_query_machines
output) to help management software pick a backward-compatible ID when
switching to the explicit RAM backend CLI (current libvirt behaviour).

> How exactly things go wrong on a clash is detail from an interface
> design point of view.  This patch changes one instance from "crash" to
> "fatal error".  No objections, just pointing out we're playing whack a
> mole there.
> 
> The fundamental mistake we made was not reserving IDs for the system's
> own use.
> 
> The excuse I heard back then was that IDs are for the user, and the
> system isn't supposed to pick any.  Well, it does.
> 
> To stop creating more moles, we need to reserve IDs for the system's
> use, and let the system pick only reserved IDs going forward.
> 
> There would be two kinds of reserved IDs: 1. an easily documented,
> easily checked ID pattern, e.g. "starts with <prefix>", to be used by
> the system going forward, and 2. the messy zoo of system IDs we have
> accumulated so far.
> 
> Thoughts?

I'd vote for #1 only, however that isn't an option
as renaming existing internal IDs will for sure break backward compat.
So perhaps a mix of #1 (for all new internal IDs) and #2 for
legacy ones, with some centralized place to keep track of them.
 
> [...]
> 
> 
> [*] Questionable idea if you ask me, but tangential to the point I'm
> trying to make in this memo.
> 

[**] If it were up to me, I'd drop implicit RAM backend creation
and require the user to provide an explicit backend on the CLI
instead of making things up for the sake of convenience.
(If there is support for this, I'll gladly post a patch.)
Thomas Huth May 23, 2023, 1:06 p.m. UTC | #5
On 23/05/2023 14.31, Markus Armbruster wrote:
...
> To stop creating more moles, we need to reserve IDs for the system's
> use, and let the system pick only reserved IDs going forward.

Just something to add here: We already have a function for generating 
internal IDs, the id_generate() function in util/id.c ... our convention is 
that we use "#" as prefix for those, so for new code (which is not affected 
by migration backward compatibility problems), we should maybe take care of 
always using that prefix for internal IDs, too.

  Thomas
Thomas Huth May 23, 2023, 1:16 p.m. UTC | #6
s/stollen/stolen/ in the subject

On 22/05/2023 15.17, Igor Mammedov wrote:
> QEMU aborts when default RAM backend should be used (i.e. no
> explicit '-machine memory-backend=' specified) but user
> has created an object which 'id' equals to default RAM backend
> name used by board.
> 
>   $QEMU -machine pc \
>         -object memory-backend-ram,id=pc.ram,size=4294967296
> 
>   Actual results:
>   QEMU 7.2.0 monitor - type 'help' for more information
>   (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
>   qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
>   Aborted (core dumped)
> 
> Instead of abort, check for the conflicting 'id' and exit with
> an error, suggesting how to remedy the issue.
> 

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2207886

> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> CC: thuth@redhat.com
> ---
>   hw/core/machine.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 07f763eb2e..1000406211 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -1338,6 +1338,14 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
>           }
>       } else if (machine_class->default_ram_id && machine->ram_size &&
>                  numa_uses_legacy_mem()) {
> +        if (object_property_find(object_get_objects_root(),
> +                                 machine_class->default_ram_id)) {
> +            error_setg(errp, "object name '%s' is reserved for the default"
> +                " RAM backend, it can't be used for any other purposes."
> +                " Change the object's 'id' to something else",
> +                machine_class->default_ram_id);
> +            return;
> +        }
>           if (!create_default_memdev(current_machine, mem_path, errp)) {
>               return;
>           }

Works for me on s390x, too (see 
https://bugzilla.redhat.com/show_bug.cgi?id=2207886#c3 )

Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Markus Armbruster May 27, 2023, 7:22 a.m. UTC | #7
Thomas Huth <thuth@redhat.com> writes:

> On 23/05/2023 14.31, Markus Armbruster wrote:
> ...
>> To stop creating more moles, we need to reserve IDs for the system's
>> use, and let the system pick only reserved IDs going forward.
>
> Just something to add here: We already have a function for generating
> internal IDs, the id_generate() function in util/id.c ... our
> convention is that we use "#" as prefix for those, so for new code
> (which is not affected by migration backward compatibility problems),
> we should maybe take care of always using that prefix for internal
> IDs, too.

Valid point.
Markus Armbruster May 27, 2023, 7:45 a.m. UTC | #8
Igor Mammedov <imammedo@redhat.com> writes:

> On Tue, 23 May 2023 14:31:30 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > QEMU aborts when default RAM backend should be used (i.e. no
>> > explicit '-machine memory-backend=' specified) but user
>> > has created an object which 'id' equals to default RAM backend
>> > name used by board.
>> >
>> >  $QEMU -machine pc \
>> >        -object memory-backend-ram,id=pc.ram,size=4294967296
>> >
>> >  Actual results:
>> >  QEMU 7.2.0 monitor - type 'help' for more information
>> >  (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
>> >  qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
>> >  Aborted (core dumped)
>> >
>> > Instead of abort, check for the conflicting 'id' and exit with
>> > an error, suggesting how to remedy the issue.  
>> 
>> This is an instance of an (unfortunately common) anti-pattern.
>> 
>> The point of an ID is to *identify*.  To do that, IDs of the same kind
>> must be unique.  "Of the same kind" because we let different kinds of
>> objects have the same ID[*].
>> 
>> IDs are arbitrary strings.  The user may pick any ID, as long as it's
>> unique.  Unique not only among the user's IDs, but the system's, too.
>> 
>> Every time we add code that picks an ID, we break backward
>> compatibility: user configurations that use this ID no longer work.
>> Thus, system-picked IDs are part of the external interface.
>
> in this case, IDs are there to keep backward compatibility
> (so migration won't fail) and it affects only default (legacy**)
> path where user doesn't provide memory-backend explicitly
> (which could be named anything that doesn't collide with other objects)
>
>> We don't treat them as such.  They are pretty much undocumented, and
>> when we add new ones, we break the external interface silently.
>
> this ID in particular is introspect-able (a part of qmp_query_machines output)
> to help mgmt pick backward compatible ID when switching to explicit
> RAM backend CLI (current libvirt behaviour).
>
>> How exactly things go wrong on a clash is detail from an interface
>> design point of view.  This patch changes one instance from "crash" to
>> "fatal error".  No objections, just pointing out we're playing whack a
>> mole there.
>> 
>> The fundamental mistake we made was not reserving IDs for the system's
>> own use.
>> 
>> The excuse I heard back then was that IDs are for the user, and the
>> system isn't supposed to pick any.  Well, it does.
>> 
>> To stop creating more moles, we need to reserve IDs for the system's
>> use, and let the system pick only reserved IDs going forward.
>> 
>> There would be two kinds of reserved IDs: 1. an easily documented,
>> easily checked ID pattern, e.g. "starts with <prefix>", to be used by
>> the system going forward, and 2. the messy zoo of system IDs we have
>> accumulated so far.
>> 
>> Thoughts?
>
> I'd vote for #1 only, however that isn't an option
> as renaming existing internal IDs will for sure break backward compat.

Yes.  Reducing the zoo by dropping and/or renaming IDs would be slow and
painful at best.

> So perhaps a mix of #1 (for all new internal IDs) and #2 for
> legacy ones, with some centralized place to keep track of them.

Yes, this is what I had in mind.

Action items:

1. Reserve ID name space for the system's use

1.a. Document

1.b. Stop letting users use reserved IDs

     Either deprecate & warn for a while, then reject, or reject right
     away.

2. Grandfather existing system-picked IDs

   The ones we care to track down we can document as reserved.

   The others we'll have to hand-wave.

Makes sense?
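
The action items above could be sketched roughly as follows. This is
standalone C, not QEMU code: the "#" prefix, the grandfathered_ids table,
the IdVerdict type and the deprecation_over flag are all hypothetical names
introduced for the example. "pc.ram" is listed only because it is the
system-picked ID discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    ID_ACCEPT,  /* ordinary user ID */
    ID_WARN,    /* system-owned, tolerated while the deprecation period runs */
    ID_REJECT   /* system-owned, refused */
} IdVerdict;

/* Item 1: hypothetical prefix reserved for new system-picked IDs. */
static const char reserved_prefix[] = "#";

/* Item 2: hypothetical central list of grandfathered system-picked IDs. */
static const char *const grandfathered_ids[] = { "pc.ram" };

/* Return true if the ID is reserved by prefix or listed as grandfathered. */
bool id_is_system_owned(const char *id)
{
    assert(id != NULL);
    if (strncmp(id, reserved_prefix, strlen(reserved_prefix)) == 0) {
        return true;
    }
    for (size_t i = 0;
         i < sizeof(grandfathered_ids) / sizeof(grandfathered_ids[0]); i++) {
        if (strcmp(id, grandfathered_ids[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Item 1.b: warn during the deprecation period, reject afterwards. */
IdVerdict check_user_id(const char *id, bool deprecation_over)
{
    if (!id_is_system_owned(id)) {
        return ID_ACCEPT;
    }
    return deprecation_over ? ID_REJECT : ID_WARN;
}
```

Passing deprecation_over as true from the start corresponds to the "reject
right away" option. The IDs nobody tracks down never enter the table, which
is the hand-waved remainder.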

>> [...]
>> 
>> 
>> [*] Questionable idea if you ask me, but tangential to the point I'm
>> trying to make in this memo.
>> 
>
> [**] If it were up to me, I'd drop implicit RAM backend creation
> and require explicit backend being provided on CLI by user
> instead of making thing up for the sake of convenience.
> (If there is a support in favor of this, I'll gladly post a patch)

I lack the expertise to advise on this.
Markus Armbruster May 27, 2023, 7:59 a.m. UTC | #9
Markus Armbruster <armbru@redhat.com> writes:

> Thomas Huth <thuth@redhat.com> writes:
>
>> On 23/05/2023 14.31, Markus Armbruster wrote:
>> ...
>>> To stop creating more moles, we need to reserve IDs for the system's
>>> use, and let the system pick only reserved IDs going forward.
>>
>> Just something to add here: We already have a function for generating
>> internal IDs, the id_generate() function in util/id.c ... our

id_generate() generates IDs of the form #<subsystem><number>, where
<number> counts up.  Suitable for IDs that are not part of the stable
interface.

When a system-picked ID needs to be part of the stable interface, we
pick it in some other way.
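
For illustration only, a generator producing IDs of the form
#<subsystem><number> might look like the sketch below. It is not the code in
util/id.c: the Subsystem enum, the tag strings, the per-subsystem counters
starting at 0 and the function name make_internal_id are all assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical subsystem tags; not QEMU's actual list. */
typedef enum { SUBSYS_BLOCK, SUBSYS_NET, SUBSYS_MAX } Subsystem;

static const char *const subsys_name[SUBSYS_MAX] = { "block", "net" };
static unsigned long subsys_counter[SUBSYS_MAX];

/*
 * Write "#<subsystem><number>" into buf.
 * Returns 0 on success, or -1 if buf is too small (no number is consumed).
 */
int make_internal_id(Subsystem s, char *buf, size_t len)
{
    assert(s < SUBSYS_MAX && buf != NULL);

    int n = snprintf(buf, len, "#%s%lu", subsys_name[s], subsys_counter[s]);
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    subsys_counter[s]++;  /* the number counts up per subsystem */
    return 0;
}
```

Because the number depends on how many IDs were generated earlier, the
result is not stable across configurations, which is why such IDs suit only
objects outside the stable interface.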

>> convention is that we use "#" as prefix for those, so for new code
>> (which is not affected by migration backward compatibility problems),
>> we should maybe take care of always using that prefix for internal
>> IDs, too.
>
> Valid point.

I propose to move towards the QAPI naming rules for user-picked IDs:
must begin with a letter, and contain only ASCII letters, digits,
hyphen, and underscore.
Igor Mammedov June 9, 2023, 2:06 p.m. UTC | #10
On Mon, 22 May 2023 15:17:17 +0200
Igor Mammedov <imammedo@redhat.com> wrote:

Paolo,
can you pick it up?

> QEMU aborts when default RAM backend should be used (i.e. no
> explicit '-machine memory-backend=' specified) but user
> has created an object which 'id' equals to default RAM backend
> name used by board.
> 
>  $QEMU -machine pc \
>        -object memory-backend-ram,id=pc.ram,size=4294967296
> 
>  Actual results:
>  QEMU 7.2.0 monitor - type 'help' for more information
>  (qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
>  qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
>  Aborted (core dumped)
> 
> Instead of abort, check for the conflicting 'id' and exit with
> an error, suggesting how to remedy the issue.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> CC: thuth@redhat.com
> ---
>  hw/core/machine.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 07f763eb2e..1000406211 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -1338,6 +1338,14 @@ void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
>          }
>      } else if (machine_class->default_ram_id && machine->ram_size &&
>                 numa_uses_legacy_mem()) {
> +        if (object_property_find(object_get_objects_root(),
> +                                 machine_class->default_ram_id)) {
> +            error_setg(errp, "object name '%s' is reserved for the default"
> +                " RAM backend, it can't be used for any other purposes."
> +                " Change the object's 'id' to something else",
> +                machine_class->default_ram_id);
> +            return;
> +        }
>          if (!create_default_memdev(current_machine, mem_path, errp)) {
>              return;
>          }
Thomas Huth June 10, 2023, 8:47 p.m. UTC | #11
On 09/06/2023 16.06, Igor Mammedov wrote:
> On Mon, 22 May 2023 15:17:17 +0200
> Igor Mammedov <imammedo@redhat.com> wrote:
> 
> Paolo,
> can you pick it up?

It's merged already (commit a37531f2381c4e294e48b14170894741283)

  Cheers,
   Thomas
Patch

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 07f763eb2e..1000406211 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1338,6 +1338,14 @@  void machine_run_board_init(MachineState *machine, const char *mem_path, Error *
         }
     } else if (machine_class->default_ram_id && machine->ram_size &&
                numa_uses_legacy_mem()) {
+        if (object_property_find(object_get_objects_root(),
+                                 machine_class->default_ram_id)) {
+            error_setg(errp, "object name '%s' is reserved for the default"
+                " RAM backend, it can't be used for any other purposes."
+                " Change the object's 'id' to something else",
+                machine_class->default_ram_id);
+            return;
+        }
         if (!create_default_memdev(current_machine, mem_path, errp)) {
             return;
         }
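
The logic of the patch can be modelled outside QEMU as a "look before you
add" check. The sketch below is a toy: ObjectsRoot, objects_find,
objects_add and create_default_ram are hypothetical stand-ins, not the QOM
API, and the error is returned through a plain character buffer instead of
an Error object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_OBJECTS 8

/* Toy stand-in for the objects container: just a list of IDs. */
typedef struct {
    const char *ids[MAX_OBJECTS];
    size_t count;
} ObjectsRoot;

/* Return true if an object with this ID already exists. */
bool objects_find(const ObjectsRoot *root, const char *id)
{
    for (size_t i = 0; i < root->count; i++) {
        if (strcmp(root->ids[i], id) == 0) {
            return true;
        }
    }
    return false;
}

/* Add an ID; fails on a duplicate or when the container is full. */
bool objects_add(ObjectsRoot *root, const char *id)
{
    if (root->count == MAX_OBJECTS || objects_find(root, id)) {
        return false;
    }
    root->ids[root->count++] = id;
    return true;
}

/* Check for a clash first and report it, instead of failing in the add. */
bool create_default_ram(ObjectsRoot *root, const char *default_ram_id,
                        char *err, size_t errlen)
{
    assert(root != NULL && default_ram_id != NULL && err != NULL);
    if (objects_find(root, default_ram_id)) {
        snprintf(err, errlen,
                 "object name '%s' is reserved for the default RAM backend",
                 default_ram_id);
        return false;
    }
    return objects_add(root, default_ram_id);
}
```

The point of the ordering is that the caller receives a recoverable error
with a remedy in the message, rather than hitting the duplicate deep inside
the add path.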