Message ID | 20241118221330.3480246-6-peterx@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | QOM: Enforce container_get() to operate on containers only | expand |
On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote: > When used incorrectly, container_get() can silently create containers even > if the caller may not intend to do so. Add a rich document describing the > helper, as container_get() should only be used in path lookups. > > Add one object_dynamic_cast() check to make sure whatever objects the > helper walks will be a container object (including the one to be returned). > It is a programming error otherwise, hence assert that. > > It may make container_get() tiny slower than before, but the hope is the > change is neglictable, as object_class_dynamic_cast() has a fast path just > for similar leaf use case. Just a heads up: out of curiosity, I tried to see whether the fast path hit that I mentioned here (mostly, commit 793c96b54032 of Paolo's), and it didn't.. It's fundamentally because all TypeImpl was allocated dynamically from heap, including its type->name. While typename should normally be const strings that locates on RODATA sections, hence they should mostly never hit when compare with pointers.. I was thinking whether we could add a strcmp() there too for the fast path, but then I noticed that QEMU could have some pretty long type->name... so that strcmp() idea may not be good if that's the case. E.g.: virtio-net-pci-non-transitional::conventional-pci-device Which has 55 chars.. I don't have good idea to make the fast path hit here, so I'll at least remove this paragraph if I'm going to repost.. I hope it's not a huge deal to still do the sanity check here, as the container type is so special and small, so that check should be fast regardless. > > Link: https://lore.kernel.org/r/87pln6ds8q.fsf@pond.sub.org > Suggested-by: Markus Armbruster <armbru@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > qom/container.c | 21 +++++++++++++++++++++ > 1 file changed, 21 insertions(+) > > diff --git a/qom/container.c b/qom/container.c > index cfec92a944..ff6e35f837 100644 > --- a/qom/container.c > +++ b/qom/container.c > @@ -24,6 +24,20 @@ static void container_register_types(void) > type_register_static(&container_info); > } > > +/** > + * container_get(): Get the container object under specific path > + * > + * @root: The root path object to start walking from. When starting from > + * root, one can pass in object_get_root(). > + * @path: The sub-path to lookup, must be an non-empty string starts with "/". > + * > + * Returns: The container object specified by @path. > + * > + * NOTE: the function may impplicitly create internal containers when the > + * whole path is not yet created. It's the caller's responsibility to make > + * sure the path specified is always used as object containers, rather than > + * any other type of objects. > + */ > Object *container_get(Object *root, const char *path) > { > Object *obj, *child; > @@ -31,6 +45,7 @@ Object *container_get(Object *root, const char *path) > int i; > > parts = g_strsplit(path, "/", 0); > + /* "path" must be an non-empty string starting with "/" */ > assert(parts != NULL && parts[0] != NULL && !parts[0][0]); > obj = root; > > @@ -40,6 +55,12 @@ Object *container_get(Object *root, const char *path) > child = object_new(TYPE_CONTAINER); > object_property_add_child(obj, parts[i], child); > object_unref(child); > + } else { > + /* > + * Each object within the path must be a container object > + * itself, including the object to be returned. > + */ > + assert(object_dynamic_cast(child, TYPE_CONTAINER)); > } > } > > -- > 2.45.0 >
Il mar 19 nov 2024, 00:06 Peter Xu <peterx@redhat.com> ha scritto: > On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote: > > When used incorrectly, container_get() can silently create containers > even > > if the caller may not intend to do so. Add a rich document describing > the > > helper, as container_get() should only be used in path lookups. > > > > Add one object_dynamic_cast() check to make sure whatever objects the > > helper walks will be a container object (including the one to be > returned). > > It is a programming error otherwise, hence assert that. > > > > It may make container_get() tiny slower than before, but the hope is the > > change is neglictable, as object_class_dynamic_cast() has a fast path > just > > for similar leaf use case. > > Just a heads up: out of curiosity, I tried to see whether the fast path hit > that I mentioned here (mostly, commit 793c96b54032 of Paolo's), and it > didn't.. > > It's fundamentally because all TypeImpl was allocated dynamically from > heap, including its type->name. Ah, that was supposed to be the difference between type_register() and type_register_static(). Perhaps type->name could be allocated with g_intern_string()? And then if object_dynamic_cast() is changed into a macro, with something like #define qemu_cache_interned_string(s) \ (__builtin_constant_p(s) \ ? ({ static const char *interned; \ interned = interned ?: g_intern_static_string(s); }) \ : g_intern_string(s)) as the third parameter. This allows object_dynamic_cast() to use a simple pointer equality for type name comparison, and the same can be applied to object_class_dynamic_cast(). Whatever we do, we should do it before Rust code starts using object_dynamic_cast! Paolo
On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote: > When used incorrectly, container_get() can silently create containers even > if the caller may not intend to do so. Add a rich document describing the > helper, as container_get() should only be used in path lookups. > > Add one object_dynamic_cast() check to make sure whatever objects the > helper walks will be a container object (including the one to be returned). > It is a programming error otherwise, hence assert that. > > It may make container_get() tiny slower than before, but the hope is the > change is neglictable, as object_class_dynamic_cast() has a fast path just > for similar leaf use case. > > Link: https://lore.kernel.org/r/87pln6ds8q.fsf@pond.sub.org > Suggested-by: Markus Armbruster <armbru@redhat.com> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > qom/container.c | 21 +++++++++++++++++++++ > 1 file changed, 21 insertions(+) > > diff --git a/qom/container.c b/qom/container.c > index cfec92a944..ff6e35f837 100644 > --- a/qom/container.c > +++ b/qom/container.c > @@ -24,6 +24,20 @@ static void container_register_types(void) > type_register_static(&container_info); > } > > +/** > + * container_get(): Get the container object under specific path > + * > + * @root: The root path object to start walking from. When starting from > + * root, one can pass in object_get_root(). > + * @path: The sub-path to lookup, must be an non-empty string starts with "/". > + * > + * Returns: The container object specified by @path. > + * > + * NOTE: the function may impplicitly create internal containers when the > + * whole path is not yet created. It's the caller's responsibility to make > + * sure the path specified is always used as object containers, rather than > + * any other type of objects. > + */ The docs are a welcome addition, but at the same time the docs won't get read most of the time. With this in mind, IMHO, it is a conceptually *terrible* design for us to have a method called "get" which magically *creates* stuff as a side-effect of its calling. We'd be well served by fixing that design problem. If I look in the code at what calls we have to container_get, and more specifically what "path" values we pass, there are not actually that many: /objects /chardevs /unattached /machine /peripheral /peripheral-anon /backend /dr-connector Ignoring the last one, those other 7 containers are things we expect to exist in *every* system emulator. Second, every single one of them is a single level deep. IOW, the for() loop in container_get is effectively pointless. We can fix this by having a single method: void container_create_builtin(Object *root) which creates the 7 built-in standard containers we expect everywhere, with open coded object_new + add_child calls. Then all current users of container_get() can switch over to object_resolve_path, and container_get() can be eliminated. The 'dr-connector' creation can just be open-coded using object_new() in the spapr code. > Object *container_get(Object *root, const char *path) > { > Object *obj, *child; > @@ -31,6 +45,7 @@ Object *container_get(Object *root, const char *path) > int i; > > parts = g_strsplit(path, "/", 0); > + /* "path" must be an non-empty string starting with "/" */ > assert(parts != NULL && parts[0] != NULL && !parts[0][0]); > obj = root; > > @@ -40,6 +55,12 @@ Object *container_get(Object *root, const char *path) > child = object_new(TYPE_CONTAINER); > object_property_add_child(obj, parts[i], child); > object_unref(child); > + } else { > + /* > + * Each object within the path must be a container object > + * itself, including the object to be returned. > + */ > + assert(object_dynamic_cast(child, TYPE_CONTAINER)); > } > } > > -- > 2.45.0 > With regards, Daniel
On Tue, Nov 19, 2024 at 09:09:16AM +0100, Paolo Bonzini wrote: > Il mar 19 nov 2024, 00:06 Peter Xu <peterx@redhat.com> ha scritto: > > > On Mon, Nov 18, 2024 at 05:13:30PM -0500, Peter Xu wrote: > > > When used incorrectly, container_get() can silently create containers > > even > > > if the caller may not intend to do so. Add a rich document describing > > the > > > helper, as container_get() should only be used in path lookups. > > > > > > Add one object_dynamic_cast() check to make sure whatever objects the > > > helper walks will be a container object (including the one to be > > returned). > > > It is a programming error otherwise, hence assert that. > > > > > > It may make container_get() tiny slower than before, but the hope is the > > > change is neglictable, as object_class_dynamic_cast() has a fast path > > just > > > for similar leaf use case. > > > > Just a heads up: out of curiosity, I tried to see whether the fast path hit > > that I mentioned here (mostly, commit 793c96b54032 of Paolo's), and it > > didn't.. > > > > It's fundamentally because all TypeImpl was allocated dynamically from > > heap, including its type->name. > > > Ah, that was supposed to be the difference between type_register() and > type_register_static(). Ah... looks like they're the same now? As type_register_static() looks like a wrapper of type_register(). I gave it a shot on booting a Linux guest with some pretty generic devices, and see how much the pointer check hit. Until I got the root login prompt, I got 8 hits out of 35488. So it's indeed hard to yet hit.. at least with the current code base. :( > > Perhaps type->name could be allocated with g_intern_string()? And then if > object_dynamic_cast() is changed into a macro, with something like > > #define qemu_cache_interned_string(s) \ > (__builtin_constant_p(s) \ > ? ({ static const char *interned; \ > interned = interned ?: g_intern_static_string(s); }) \ > : g_intern_string(s)) > > as the third parameter. This allows object_dynamic_cast() to use a simple > pointer equality for type name comparison, and the same can be applied to > object_class_dynamic_cast(). Interesting to know this facility! Though, IIUC this may: - For builtin-consts, it grows 8 bytes for each call sites on the binary generated, even if (I think...) most of the sites are slow paths, and there're plenty of such calls.. - For non-builtin strings, g_intern_string() will add one more hash operation for the whole string (and per discussed previously, looks like the string can be not always short..). So I'm not 100% sure yet if this is what we want. Do we have known places that we care a lot on object[_class]_dynamic_cast() performance? I can give it some measurement if there is, otherwise I'm guessing whatever changes could fall into the noise.. then we can also leave that for later, knowing that the fast path will hardly hit for now, but that shouldn't be a major issue either, I assume. > > Whatever we do, we should do it before Rust code starts using > object_dynamic_cast! Thanks!
On Tue, Nov 19, 2024 at 10:03:22AM +0000, Daniel P. Berrangé wrote: > The docs are a welcome addition, but at the same time the docs won't get > read most of the time. > > With this in mind, IMHO, it is a conceptually *terrible* design for us to > have a method called "get" which magically *creates* stuff as a side-effect > of its calling. We'd be well served by fixing that design problem. > > If I look in the code at what calls we have to container_get, and more > specifically what "path" values we pass, there are not actually that many: > > /objects > /chardevs > /unattached > /machine > /peripheral > /peripheral-anon > /backend > /dr-connector > > > Ignoring the last one, those other 7 containers are things we expect > to exist in *every* system emulator. > > Second, every single one of them is a single level deep. IOW, the for() > loop in container_get is effectively pointless. > > We can fix this by having a single method: > > void container_create_builtin(Object *root) > > which creates the 7 built-in standard containers we expect > everywhere, with open coded object_new + add_child calls. > > Then all current users of container_get() can switch over > to object_resolve_path, and container_get() can be eliminated. > > The 'dr-connector' creation can just be open-coded using > object_new() in the spapr code. Yes I think this could make sense, also after I noticed that the assert I added may not always work.. Please ignore this series then, I'll prepare something else soon.
Il mar 19 nov 2024, 21:07 Peter Xu <peterx@redhat.com> ha scritto: > > Ah, that was supposed to be the difference between type_register() and > > type_register_static(). > > Ah... looks like they're the same now? As type_register_static() looks like > a wrapper of type_register(). > And pretty much have always been... Zhao looked into it recently. > Perhaps type->name could be allocated with g_intern_string()? And then if > > object_dynamic_cast() is changed into a macro, with something like > > > > #define qemu_cache_interned_string(s) \ > > (__builtin_constant_p(s) \ > > ? ({ static const char *interned; \ > > interned = interned ?: g_intern_static_string(s); }) \ > > : g_intern_string(s)) > > > > as the third parameter. This allows object_dynamic_cast() to use a simple > > pointer equality for type name comparison, and the same can be applied to > > object_class_dynamic_cast(). > > Interesting to know this facility! Though, IIUC this may: > > - For builtin-consts, it grows 8 bytes for each call sites on the binary > generated, even if (I think...) most of the sites are slow paths, and > there're plenty of such calls.. > Right, but all the fast paths will be here. Only a few will be of course. - For non-builtin strings, g_intern_string() will add one more hash > operation for the whole string (and per discussed previously, looks > like the string can be not always short..). > Yes, but non-const strings should be *really* rare and not fast paths. So I'm not 100% sure yet if this is what we want. > > Do we have known places that we care a lot on object[_class]_dynamic_cast() > performance? The easiest way to check is probably to print the type of every successful object_dynamic_cast and object_class_dynamic_cast. I suspect the result will be virtio-blk-device and/or scsi-hd, but maybe those already do an unsafe cast (pointer type cast) instead of object_dynamic_cast. I can give it some measurement if there is, otherwise I'm > guessing whatever changes could fall into the noise. Yes, probably. At most you can identify if there any heavy places out of the 34000 calls, and see if they can use an unsafe cast. Paolo
On Tue, Nov 19, 2024 at 09:30:09PM +0100, Paolo Bonzini wrote: > > > > Do we have known places that we care a lot on object[_class]_dynamic_cast() > > performance? > > The easiest way to check is probably to print the type of every successful > object_dynamic_cast and object_class_dynamic_cast. I suspect the result > will be virtio-blk-device and/or scsi-hd, but maybe those already do an > unsafe cast (pointer type cast) instead of object_dynamic_cast. Yes, it sounds more reasonable to me to optimize specific call sites so far rather than provides something generic.. Though it could still be a generic API so that devices can opt-in. Maybe still not as fast as an unsafe cast, though.. I think I'll leave that to block experts when it may needs some good measurements. > > I can give it some measurement if there is, otherwise I'm > > guessing whatever changes could fall into the noise. > > > Yes, probably. At most you can identify if there any heavy places out of > the 34000 calls, and see if they can use an unsafe cast. I can still trivially do this. I traced qemu using bpf and interestingly in my case close to half (over 10000+) of the calls are about ahci_irq_lower() from different higher level stack (yeah I used IDE in my setup.. with a split irqchi..), where it has: PCIDevice *pci_dev = (PCIDevice *) object_dynamic_cast(OBJECT(dev_state), TYPE_PCI_DEVICE); So IIUC that can be open to a unsafe cast too, but considering IDE is ODD FIXES stage, I'm not sure if I should send a patch at all. However I copied John regardless. Thanks,
Il mar 19 nov 2024, 22:43 Peter Xu <peterx@redhat.com> ha scritto: > > The easiest way to check is probably to print the type of every > successful > > object_dynamic_cast and object_class_dynamic_cast. I suspect the result > > will be virtio-blk-device and/or scsi-hd, but maybe those already do an > > unsafe cast (pointer type cast) instead of object_dynamic_cast. > > Yes, it sounds more reasonable to me to optimize specific call sites so far > rather than provides something generic. Though it could still be a > generic API so that devices can opt-in. One of the things that I am excited about for Rust is checking at compile time whether a cast is to a superclass, which makes it safe automatically. > I can give it some measurement if there is, otherwise I'm > > > guessing whatever changes could fall into the noise. > > > > > > Yes, probably. At most you can identify if there any heavy places out of > > the 34000 calls, and see if they can use an unsafe cast. > > I can still trivially do this. > > I traced qemu using bpf Nice! I want to know more. :)) A > and interestingly in my case close to half (over > 10000+) of the calls are about ahci_irq_lower() from different higher level > stack (yeah I used IDE in my setup.. with a split irqchi..), where it has: > > PCIDevice *pci_dev = (PCIDevice *) > object_dynamic_cast(OBJECT(dev_state), > > TYPE_PCI_DEVICE); > > So IIUC that can be open to a unsafe cast too Hmm no it can't because there's also sysbus AHCI. The fix would be to add an AHCIClass and make irq toggling into a method there but considering IDE is ODD FIXES stage, I'm not sure if I should send a > patch at all. However I copied John regardless. > Well, MAINTAINERS only says the kind of work that the maintainer is doing, you can always do more. However it seems like not a small amount, so maybe adding a comment is enough if somebody else wants to do it? Paolo > Thanks, > > -- > Peter Xu > >
On Wed, Nov 20, 2024 at 12:45:19PM +0100, Paolo Bonzini wrote: > Il mar 19 nov 2024, 22:43 Peter Xu <peterx@redhat.com> ha scritto: > > > > The easiest way to check is probably to print the type of every > > successful > > > object_dynamic_cast and object_class_dynamic_cast. I suspect the result > > > will be virtio-blk-device and/or scsi-hd, but maybe those already do an > > > unsafe cast (pointer type cast) instead of object_dynamic_cast. > > > > Yes, it sounds more reasonable to me to optimize specific call sites so far > > rather than provides something generic. > > Though it could still be a > > generic API so that devices can opt-in. > > > One of the things that I am excited about for Rust is checking at compile > time whether a cast is to a superclass, which makes it safe automatically. I see. However looks like it doesn't easily apply to the ahci example below, where it could conditionally fail the cast (and where I got it wrong..)? > > > I can give it some measurement if there is, otherwise I'm > > > > guessing whatever changes could fall into the noise. > > > > > > > > > Yes, probably. At most you can identify if there any heavy places out of > > > the 34000 calls, and see if they can use an unsafe cast. > > > > I can still trivially do this. > > > > I traced qemu using bpf > > > Nice! I want to know more. :)) I also only learned it yesterday, where I only used to use k*probes previously. :-) That's: $ cat qemu.bpf uprobe:/home/peterx/git/qemu/bin/qemu-system-x86_64:object_class_dynamic_cast { @out[ustack()]++; } $ sudo bpftrace --usdt-file-activation ./qemu.bpf > > > and interestingly in my case close to half (over > > 10000+) of the calls are about ahci_irq_lower() from different higher level > > stack (yeah I used IDE in my setup.. with a split irqchi..), where it has: > > > > PCIDevice *pci_dev = (PCIDevice *) > > object_dynamic_cast(OBJECT(dev_state), > > > > TYPE_PCI_DEVICE); > > > > So IIUC that can be open to a unsafe cast too > > > Hmm no it can't because there's also sysbus AHCI. The fix would be to add > an AHCIClass and make irq toggling into a method there Yep, I overlooked the lines of code later.. :( > > but considering IDE is ODD FIXES stage, I'm not sure if I should send a > > patch at all. However I copied John regardless. > > > > Well, MAINTAINERS only says the kind of work that the maintainer is doing, > you can always do more. However it seems like not a small amount, so maybe > adding a comment is enough if somebody else wants to do it? Can do.
diff --git a/qom/container.c b/qom/container.c index cfec92a944..ff6e35f837 100644 --- a/qom/container.c +++ b/qom/container.c @@ -24,6 +24,20 @@ static void container_register_types(void) type_register_static(&container_info); } +/** + * container_get(): Get the container object under specific path + * + * @root: The root path object to start walking from. When starting from + * root, one can pass in object_get_root(). + * @path: The sub-path to lookup, must be an non-empty string starts with "/". + * + * Returns: The container object specified by @path. + * + * NOTE: the function may impplicitly create internal containers when the + * whole path is not yet created. It's the caller's responsibility to make + * sure the path specified is always used as object containers, rather than + * any other type of objects. + */ Object *container_get(Object *root, const char *path) { Object *obj, *child; @@ -31,6 +45,7 @@ Object *container_get(Object *root, const char *path) int i; parts = g_strsplit(path, "/", 0); + /* "path" must be an non-empty string starting with "/" */ assert(parts != NULL && parts[0] != NULL && !parts[0][0]); obj = root; @@ -40,6 +55,12 @@ Object *container_get(Object *root, const char *path) child = object_new(TYPE_CONTAINER); object_property_add_child(obj, parts[i], child); object_unref(child); + } else { + /* + * Each object within the path must be a container object + * itself, including the object to be returned. + */ + assert(object_dynamic_cast(child, TYPE_CONTAINER)); } }
When used incorrectly, container_get() can silently create containers even if the caller may not intend to do so. Add a rich document describing the helper, as container_get() should only be used in path lookups. Add one object_dynamic_cast() check to make sure whatever objects the helper walks will be a container object (including the one to be returned). It is a programming error otherwise, hence assert that. It may make container_get() tiny slower than before, but the hope is the change is neglictable, as object_class_dynamic_cast() has a fast path just for similar leaf use case. Link: https://lore.kernel.org/r/87pln6ds8q.fsf@pond.sub.org Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> --- qom/container.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)