diff mbox series

[5/5] qom: Make container_get() strict to always walk or return container

Message ID 20241118221330.3480246-6-peterx@redhat.com (mailing list archive)
State New
Headers show
Series QOM: Enforce container_get() to operate on containers only | expand

Commit Message

Peter Xu Nov. 18, 2024, 10:13 p.m. UTC
When used incorrectly, container_get() can silently create containers even
if the caller may not intend to do so.  Add a rich document describing the
helper, as container_get() should only be used in path lookups.

Add one object_dynamic_cast() check to make sure whatever objects the
helper walks will be a container object (including the one to be returned).
It is a programming error otherwise, hence assert that.

It may make container_get() tiny slower than before, but the hope is the
change is neglictable, as object_class_dynamic_cast() has a fast path just
for similar leaf use case.

Link: https://lore.kernel.org/r/87pln6ds8q.fsf@pond.sub.org
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qom/container.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Peter Xu Nov. 18, 2024, 11:06 p.m. UTC | #1
On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote:
> When used incorrectly, container_get() can silently create containers even
> if the caller may not intend to do so.  Add a rich document describing the
> helper, as container_get() should only be used in path lookups.
> 
> Add one object_dynamic_cast() check to make sure whatever objects the
> helper walks will be a container object (including the one to be returned).
> It is a programming error otherwise, hence assert that.
> 
> It may make container_get() tiny slower than before, but the hope is the
> change is neglictable, as object_class_dynamic_cast() has a fast path just
> for similar leaf use case.

Just a heads up: out of curiosity, I tried to see whether the fast path hit
that I mentioned here (mostly, commit 793c96b54032 of Paolo's), and it
didn't..

It's fundamentally because all TypeImpl was allocated dynamically from
heap, including its type->name.  While typename should normally be const
strings that locates on RODATA sections, hence they should mostly never
hit when compare with pointers..

I was thinking whether we could add a strcmp() there too for the fast path,
but then I noticed that QEMU could have some pretty long type->name... so
that strcmp() idea may not be good if that's the case. E.g.:

  virtio-net-pci-non-transitional::conventional-pci-device

Which has 55 chars..

I don't have good idea to make the fast path hit here, so I'll at least
remove this paragraph if I'm going to repost.. I hope it's not a huge deal
to still do the sanity check here, as the container type is so special and
small, so that check should be fast regardless.

> 
> Link: https://lore.kernel.org/r/87pln6ds8q.fsf@pond.sub.org
> Suggested-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qom/container.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/qom/container.c b/qom/container.c
> index cfec92a944..ff6e35f837 100644
> --- a/qom/container.c
> +++ b/qom/container.c
> @@ -24,6 +24,20 @@ static void container_register_types(void)
>      type_register_static(&container_info);
>  }
>  
> +/**
> + * container_get(): Get the container object under specific path
> + *
> + * @root: The root path object to start walking from.  When starting from
> + *        root, one can pass in object_get_root().
> + * @path: The sub-path to lookup, must be an non-empty string starts with "/".
> + *
> + * Returns: The container object specified by @path.
> + *
> + * NOTE: the function may impplicitly create internal containers when the
> + * whole path is not yet created.  It's the caller's responsibility to make
> + * sure the path specified is always used as object containers, rather than
> + * any other type of objects.
> + */
>  Object *container_get(Object *root, const char *path)
>  {
>      Object *obj, *child;
> @@ -31,6 +45,7 @@ Object *container_get(Object *root, const char *path)
>      int i;
>  
>      parts = g_strsplit(path, "/", 0);
> +    /* "path" must be an non-empty string starting with "/" */
>      assert(parts != NULL && parts[0] != NULL && !parts[0][0]);
>      obj = root;
>  
> @@ -40,6 +55,12 @@ Object *container_get(Object *root, const char *path)
>              child = object_new(TYPE_CONTAINER);
>              object_property_add_child(obj, parts[i], child);
>              object_unref(child);
> +        } else {
> +            /*
> +             * Each object within the path must be a container object
> +             * itself, including the object to be returned.
> +             */
> +            assert(object_dynamic_cast(child, TYPE_CONTAINER));
>          }
>      }
>  
> -- 
> 2.45.0
>
Paolo Bonzini Nov. 19, 2024, 8:09 a.m. UTC | #2
Il mar 19 nov 2024, 00:06 Peter Xu <peterx@redhat.com> ha scritto:

> On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote:
> > When used incorrectly, container_get() can silently create containers
> even
> > if the caller may not intend to do so.  Add a rich document describing
> the
> > helper, as container_get() should only be used in path lookups.
> >
> > Add one object_dynamic_cast() check to make sure whatever objects the
> > helper walks will be a container object (including the one to be
> returned).
> > It is a programming error otherwise, hence assert that.
> >
> > It may make container_get() tiny slower than before, but the hope is the
> > change is neglictable, as object_class_dynamic_cast() has a fast path
> just
> > for similar leaf use case.
>
> Just a heads up: out of curiosity, I tried to see whether the fast path hit
> that I mentioned here (mostly, commit 793c96b54032 of Paolo's), and it
> didn't..
>
> It's fundamentally because all TypeImpl was allocated dynamically from
> heap, including its type->name.


Ah, that was supposed to be the difference between type_register() and
type_register_static().

Perhaps type->name could be allocated with g_intern_string()? And then if
object_dynamic_cast() is changed into a macro, with something like

#define qemu_cache_interned_string(s) \
  (__builtin_constant_p(s) \
   ? ({ static const char *interned; \
        interned = interned ?: g_intern_static_string(s); }) \
   : g_intern_string(s))

as the third parameter. This allows object_dynamic_cast() to use a simple
pointer equality for type name comparison, and the same can be applied to
object_class_dynamic_cast().

Whatever we do, we should do it before Rust code starts using
object_dynamic_cast!

Paolo
Daniel P. Berrangé Nov. 19, 2024, 10:03 a.m. UTC | #3
On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote:
> When used incorrectly, container_get() can silently create containers even
> if the caller may not intend to do so.  Add a rich document describing the
> helper, as container_get() should only be used in path lookups.
> 
> Add one object_dynamic_cast() check to make sure whatever objects the
> helper walks will be a container object (including the one to be returned).
> It is a programming error otherwise, hence assert that.
> 
> It may make container_get() tiny slower than before, but the hope is the
> change is neglictable, as object_class_dynamic_cast() has a fast path just
> for similar leaf use case.
> 
> Link: https://lore.kernel.org/r/87pln6ds8q.fsf@pond.sub.org
> Suggested-by: Markus Armbruster <armbru@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qom/container.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/qom/container.c b/qom/container.c
> index cfec92a944..ff6e35f837 100644
> --- a/qom/container.c
> +++ b/qom/container.c
> @@ -24,6 +24,20 @@ static void container_register_types(void)
>      type_register_static(&container_info);
>  }
>  
> +/**
> + * container_get(): Get the container object under specific path
> + *
> + * @root: The root path object to start walking from.  When starting from
> + *        root, one can pass in object_get_root().
> + * @path: The sub-path to lookup, must be an non-empty string starts with "/".
> + *
> + * Returns: The container object specified by @path.
> + *
> + * NOTE: the function may impplicitly create internal containers when the
> + * whole path is not yet created.  It's the caller's responsibility to make
> + * sure the path specified is always used as object containers, rather than
> + * any other type of objects.
> + */

The docs are a welcome addition, but at the same time the docs won't get
read most of the time.

With this in mind, IMHO, it is a conceptually *terrible* design for us to
have a method called "get" which magically *creates* stuff as a side-effect
of its calling. We'd be well served by fixing that design problem.

If I look in the code at what calls we have to container_get, and more
specifically what "path" values we pass, there are not actually that many:

  /objects
  /chardevs
  /unattached
  /machine
  /peripheral
  /peripheral-anon
  /backend
  /dr-connector
  

Ignoring the last one, those other 7 containers are things we expect
to exist in *every* system emulator.

Second, every single one of them is a single level deep. IOW, the for()
loop in container_get is effectively pointless.

We can fix this by having a single method:

 void container_create_builtin(Object *root)

which creates the 7 built-in standard containers we expect
everywhere, with open coded object_new + add_child calls. 

Then all current users of container_get() can switch over
to object_resolve_path, and container_get() can be eliminated.

The 'dr-connector' creation can just be open-coded using
object_new() in the spapr code.

>  Object *container_get(Object *root, const char *path)
>  {
>      Object *obj, *child;
> @@ -31,6 +45,7 @@ Object *container_get(Object *root, const char *path)
>      int i;
>  
>      parts = g_strsplit(path, "/", 0);
> +    /* "path" must be an non-empty string starting with "/" */
>      assert(parts != NULL && parts[0] != NULL && !parts[0][0]);
>      obj = root;
>  
> @@ -40,6 +55,12 @@ Object *container_get(Object *root, const char *path)
>              child = object_new(TYPE_CONTAINER);
>              object_property_add_child(obj, parts[i], child);
>              object_unref(child);
> +        } else {
> +            /*
> +             * Each object within the path must be a container object
> +             * itself, including the object to be returned.
> +             */
> +            assert(object_dynamic_cast(child, TYPE_CONTAINER));
>          }
>      }
>  
> -- 
> 2.45.0
> 

With regards,
Daniel
Peter Xu Nov. 19, 2024, 8:06 p.m. UTC | #4
On Tue, Nov 19, 2024 at 09:09:16AM +0100, Paolo Bonzini wrote:
> Il mar 19 nov 2024, 00:06 Peter Xu <peterx@redhat.com> ha scritto:
> 
> > On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote:
> > > When used incorrectly, container_get() can silently create containers
> > even
> > > if the caller may not intend to do so.  Add a rich document describing
> > the
> > > helper, as container_get() should only be used in path lookups.
> > >
> > > Add one object_dynamic_cast() check to make sure whatever objects the
> > > helper walks will be a container object (including the one to be
> > returned).
> > > It is a programming error otherwise, hence assert that.
> > >
> > > It may make container_get() tiny slower than before, but the hope is the
> > > change is neglictable, as object_class_dynamic_cast() has a fast path
> > just
> > > for similar leaf use case.
> >
> > Just a heads up: out of curiosity, I tried to see whether the fast path hit
> > that I mentioned here (mostly, commit 793c96b54032 of Paolo's), and it
> > didn't..
> >
> > It's fundamentally because all TypeImpl was allocated dynamically from
> > heap, including its type->name.
> 
> 
> Ah, that was supposed to be the difference between type_register() and
> type_register_static().

Ah... looks like they're the same now? As type_register_static() looks like
a wrapper of type_register().

I gave it a shot on booting a Linux guest with some pretty generic devices,
and see how much the pointer check hit.  Until I got the root login prompt,
I got 8 hits out of 35488.  So it's indeed hard to yet hit.. at least with
the current code base. :(

> 
> Perhaps type->name could be allocated with g_intern_string()? And then if
> object_dynamic_cast() is changed into a macro, with something like
> 
> #define qemu_cache_interned_string(s) \
>   (__builtin_constant_p(s) \
>    ? ({ static const char *interned; \
>         interned = interned ?: g_intern_static_string(s); }) \
>    : g_intern_string(s))
> 
> as the third parameter. This allows object_dynamic_cast() to use a simple
> pointer equality for type name comparison, and the same can be applied to
> object_class_dynamic_cast().

Interesting to know this facility!  Though, IIUC this may:

  - For builtin-consts, it grows 8 bytes for each call sites on the binary
    generated, even if (I think...) most of the sites are slow paths, and
    there're plenty of such calls..

  - For non-builtin strings, g_intern_string() will add one more hash
    operation for the whole string (and per discussed previously, looks
    like the string can be not always short..).

So I'm not 100% sure yet if this is what we want.

Do we have known places that we care a lot on object[_class]_dynamic_cast()
performance?  I can give it some measurement if there is, otherwise I'm
guessing whatever changes could fall into the noise.. then we can also
leave that for later, knowing that the fast path will hardly hit for now,
but that shouldn't be a major issue either, I assume.

> 
> Whatever we do, we should do it before Rust code starts using
> object_dynamic_cast!

Thanks!
Peter Xu Nov. 19, 2024, 8:25 p.m. UTC | #5
On Tue, Nov 19, 2024 at 10:03:22AM +0000, Daniel P. Berrangé wrote:
> The docs are a welcome addition, but at the same time the docs won't get
> read most of the time.
> 
> With this in mind, IMHO, it is a conceptually *terrible* design for us to
> have a method called "get" which magically *creates* stuff as a side-effect
> of its calling. We'd be well served by fixing that design problem.
> 
> If I look in the code at what calls we have to container_get, and more
> specifically what "path" values we pass, there are not actually that many:
> 
>   /objects
>   /chardevs
>   /unattached
>   /machine
>   /peripheral
>   /peripheral-anon
>   /backend
>   /dr-connector
>   
> 
> Ignoring the last one, those other 7 containers are things we expect
> to exist in *every* system emulator.
> 
> Second, every single one of them is a single level deep. IOW, the for()
> loop in container_get is effectively pointless.
> 
> We can fix this by having a single method:
> 
>  void container_create_builtin(Object *root)
> 
> which creates the 7 built-in standard containers we expect
> everywhere, with open coded object_new + add_child calls. 
> 
> Then all current users of container_get() can switch over
> to object_resolve_path, and container_get() can be eliminated.
> 
> The 'dr-connector' creation can just be open-coded using
> object_new() in the spapr code.

Yes I think this could make sense, also after I noticed that the assert I
added may not always work..  Please ignore this series then, I'll prepare
something else soon.
Paolo Bonzini Nov. 19, 2024, 8:30 p.m. UTC | #6
Il mar 19 nov 2024, 21:07 Peter Xu <peterx@redhat.com> ha scritto:

> > Ah, that was supposed to be the difference between type_register() and
> > type_register_static().
>
> Ah... looks like they're the same now? As type_register_static() looks like
> a wrapper of type_register().
>

And pretty much have always been... Zhao looked into it recently.

> Perhaps type->name could be allocated with g_intern_string()? And then if
> > object_dynamic_cast() is changed into a macro, with something like
> >
> > #define qemu_cache_interned_string(s) \
> >   (__builtin_constant_p(s) \
> >    ? ({ static const char *interned; \
> >         interned = interned ?: g_intern_static_string(s); }) \
> >    : g_intern_string(s))
> >
> > as the third parameter. This allows object_dynamic_cast() to use a simple
> > pointer equality for type name comparison, and the same can be applied to
> > object_class_dynamic_cast().
>
> Interesting to know this facility!  Though, IIUC this may:
>
>   - For builtin-consts, it grows 8 bytes for each call sites on the binary
>     generated, even if (I think...) most of the sites are slow paths, and
>     there're plenty of such calls..
>

Right, but all the fast paths will be here. Only a few will be of course.

  - For non-builtin strings, g_intern_string() will add one more hash
>     operation for the whole string (and per discussed previously, looks
>     like the string can be not always short..).
>

Yes, but non-const strings should be *really* rare and not fast paths.

So I'm not 100% sure yet if this is what we want.
>
> Do we have known places that we care a lot on object[_class]_dynamic_cast()
> performance?


The easiest way to check is probably to print the type of every successful
object_dynamic_cast and object_class_dynamic_cast. I suspect the result
will be virtio-blk-device and/or scsi-hd, but maybe those already do an
unsafe cast (pointer type cast) instead of object_dynamic_cast.

I can give it some measurement if there is, otherwise I'm
> guessing whatever changes could fall into the noise.


Yes, probably. At most you can identify if there any heavy places out of
the 34000 calls, and see if they can use an unsafe cast.

Paolo
Peter Xu Nov. 19, 2024, 9:43 p.m. UTC | #7
On Tue, Nov 19, 2024 at 09:30:09PM +0100, Paolo Bonzini wrote:
> >
> > Do we have known places that we care a lot on object[_class]_dynamic_cast()
> > performance?
> 
> The easiest way to check is probably to print the type of every successful
> object_dynamic_cast and object_class_dynamic_cast. I suspect the result
> will be virtio-blk-device and/or scsi-hd, but maybe those already do an
> unsafe cast (pointer type cast) instead of object_dynamic_cast.

Yes, it sounds more reasonable to me to optimize specific call sites so far
rather than provides something generic..  Though it could still be a
generic API so that devices can opt-in.  Maybe still not as fast as an
unsafe cast, though.. I think I'll leave that to block experts when it may
needs some good measurements.

> 
> I can give it some measurement if there is, otherwise I'm
> > guessing whatever changes could fall into the noise.
> 
> 
> Yes, probably. At most you can identify if there any heavy places out of
> the 34000 calls, and see if they can use an unsafe cast.

I can still trivially do this.

I traced qemu using bpf and interestingly in my case close to half (over
10000+) of the calls are about ahci_irq_lower() from different higher level
stack (yeah I used IDE in my setup.. with a split irqchi..), where it has:

    PCIDevice *pci_dev = (PCIDevice *) object_dynamic_cast(OBJECT(dev_state),
                                                           TYPE_PCI_DEVICE);

So IIUC that can be open to a unsafe cast too, but considering IDE is ODD
FIXES stage, I'm not sure if I should send a patch at all.  However I
copied John regardless.

Thanks,
Paolo Bonzini Nov. 20, 2024, 11:45 a.m. UTC | #8
Il mar 19 nov 2024, 22:43 Peter Xu <peterx@redhat.com> ha scritto:

> > The easiest way to check is probably to print the type of every
> successful
> > object_dynamic_cast and object_class_dynamic_cast. I suspect the result
> > will be virtio-blk-device and/or scsi-hd, but maybe those already do an
> > unsafe cast (pointer type cast) instead of object_dynamic_cast.
>
> Yes, it sounds more reasonable to me to optimize specific call sites so far
> rather than provides something generic.

Though it could still be a
> generic API so that devices can opt-in.


One of the things that I am excited about for Rust is checking at compile
time whether a cast is to a superclass, which makes it safe automatically.

> I can give it some measurement if there is, otherwise I'm
> > > guessing whatever changes could fall into the noise.
> >
> >
> > Yes, probably. At most you can identify if there any heavy places out of
> > the 34000 calls, and see if they can use an unsafe cast.
>
> I can still trivially do this.
>
> I traced qemu using bpf


Nice! I want to know more. :))
A

> and interestingly in my case close to half (over
> 10000+) of the calls are about ahci_irq_lower() from different higher level
> stack (yeah I used IDE in my setup.. with a split irqchi..), where it has:
>
>     PCIDevice *pci_dev = (PCIDevice *)
> object_dynamic_cast(OBJECT(dev_state),
>
>  TYPE_PCI_DEVICE);
>
> So IIUC that can be open to a unsafe cast too


Hmm no it can't because there's also sysbus AHCI. The fix would be to add
an AHCIClass and make irq toggling into a method there

but considering IDE is ODD FIXES stage, I'm not sure if I should send a
> patch at all.  However I copied John regardless.
>

Well, MAINTAINERS only says the kind of work that the maintainer is doing,
you can always do more. However it seems like not a small amount, so maybe
adding a comment is enough if somebody else wants to do it?

Paolo


> Thanks,
>
> --
> Peter Xu
>
>
Peter Xu Nov. 20, 2024, 4:24 p.m. UTC | #9
On Wed, Nov 20, 2024 at 12:45:19PM +0100, Paolo Bonzini wrote:
> Il mar 19 nov 2024, 22:43 Peter Xu <peterx@redhat.com> ha scritto:
> 
> > > The easiest way to check is probably to print the type of every
> > successful
> > > object_dynamic_cast and object_class_dynamic_cast. I suspect the result
> > > will be virtio-blk-device and/or scsi-hd, but maybe those already do an
> > > unsafe cast (pointer type cast) instead of object_dynamic_cast.
> >
> > Yes, it sounds more reasonable to me to optimize specific call sites so far
> > rather than provides something generic.
> 
> Though it could still be a
> > generic API so that devices can opt-in.
> 
> 
> One of the things that I am excited about for Rust is checking at compile
> time whether a cast is to a superclass, which makes it safe automatically.

I see.  However looks like it doesn't easily apply to the ahci example
below, where it could conditionally fail the cast (and where I got it
wrong..)?

> 
> > I can give it some measurement if there is, otherwise I'm
> > > > guessing whatever changes could fall into the noise.
> > >
> > >
> > > Yes, probably. At most you can identify if there any heavy places out of
> > > the 34000 calls, and see if they can use an unsafe cast.
> >
> > I can still trivially do this.
> >
> > I traced qemu using bpf
> 
> 
> Nice! I want to know more. :))

I also only learned it yesterday, where I only used to use k*probes
previously. :-) That's:

$ cat qemu.bpf
uprobe:/home/peterx/git/qemu/bin/qemu-system-x86_64:object_class_dynamic_cast
{
        @out[ustack()]++;
}
$ sudo bpftrace --usdt-file-activation ./qemu.bpf

> 
> > and interestingly in my case close to half (over
> > 10000+) of the calls are about ahci_irq_lower() from different higher level
> > stack (yeah I used IDE in my setup.. with a split irqchi..), where it has:
> >
> >     PCIDevice *pci_dev = (PCIDevice *)
> > object_dynamic_cast(OBJECT(dev_state),
> >
> >  TYPE_PCI_DEVICE);
> >
> > So IIUC that can be open to a unsafe cast too
> 
> 
> Hmm no it can't because there's also sysbus AHCI. The fix would be to add
> an AHCIClass and make irq toggling into a method there

Yep, I overlooked the lines of code later.. :(

> 
> but considering IDE is ODD FIXES stage, I'm not sure if I should send a
> > patch at all.  However I copied John regardless.
> >
> 
> Well, MAINTAINERS only says the kind of work that the maintainer is doing,
> you can always do more. However it seems like not a small amount, so maybe
> adding a comment is enough if somebody else wants to do it?

Can do.
diff mbox series

Patch

diff --git a/qom/container.c b/qom/container.c
index cfec92a944..ff6e35f837 100644
--- a/qom/container.c
+++ b/qom/container.c
@@ -24,6 +24,20 @@  static void container_register_types(void)
     type_register_static(&container_info);
 }
 
+/**
+ * container_get(): Get the container object under specific path
+ *
+ * @root: The root path object to start walking from.  When starting from
+ *        root, one can pass in object_get_root().
+ * @path: The sub-path to lookup, must be an non-empty string starts with "/".
+ *
+ * Returns: The container object specified by @path.
+ *
+ * NOTE: the function may impplicitly create internal containers when the
+ * whole path is not yet created.  It's the caller's responsibility to make
+ * sure the path specified is always used as object containers, rather than
+ * any other type of objects.
+ */
 Object *container_get(Object *root, const char *path)
 {
     Object *obj, *child;
@@ -31,6 +45,7 @@  Object *container_get(Object *root, const char *path)
     int i;
 
     parts = g_strsplit(path, "/", 0);
+    /* "path" must be an non-empty string starting with "/" */
     assert(parts != NULL && parts[0] != NULL && !parts[0][0]);
     obj = root;
 
@@ -40,6 +55,12 @@  Object *container_get(Object *root, const char *path)
             child = object_new(TYPE_CONTAINER);
             object_property_add_child(obj, parts[i], child);
             object_unref(child);
+        } else {
+            /*
+             * Each object within the path must be a container object
+             * itself, including the object to be returned.
+             */
+            assert(object_dynamic_cast(child, TYPE_CONTAINER));
         }
     }