diff mbox series

[01/19] hw/core/loader.c: Fix type conflict of GLib function pointers

Message ID 44e9ef95ea6e8aa33560ca8580db0033623f0e91.1744787186.git.ktokunaga.mail@gmail.com (mailing list archive)
State New
Headers show
Series Enable QEMU TCI to run 32bit guests on browsers | expand

Commit Message

Kohei Tokunaga April 16, 2025, 8:14 a.m. UTC
On emscripten, function pointer casts can cause function call failure.
This commit fixes the function definition to match to the type of the
function call using g_list_sort_with_data.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
---
 hw/core/loader.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Daniel P. Berrangé April 16, 2025, 8:28 a.m. UTC | #1
On Wed, Apr 16, 2025 at 05:14:06PM +0900, Kohei Tokunaga wrote:
> On emscripten, function pointer casts can cause function call failure.
> This commit fixes the function definition to match to the type of the
> function call using g_list_sort_with_data.
> 
> Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
> ---
>  hw/core/loader.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index 2e35f0aa90..93a8b45d28 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -1410,7 +1410,7 @@ typedef struct RomSec {
>   * work, but this way saves a little work later by avoiding
>   * dealing with "gaps" of 0 length.
>   */
> -static gint sort_secs(gconstpointer a, gconstpointer b)
> +static gint sort_secs(gconstpointer a, gconstpointer b, gpointer d)
>  {
>      RomSec *ra = (RomSec *) a;
>      RomSec *rb = (RomSec *) b;
> @@ -1463,7 +1463,7 @@ RomGap rom_find_largest_gap_between(hwaddr base, size_t size)
>      /* sentinel */
>      secs = add_romsec_to_list(secs, base + size, 1);
>  
> -    secs = g_list_sort(secs, sort_secs);
> +    secs = g_list_sort_with_data(secs, sort_secs, NULL);

I don't see what the problem is with the original code.

The commit message says we have a bad function cast, but the original
method decl is

  GList *g_list_sort(GList*list, GCompareFunc compare_func);

where the callback is

   typedef gint (*GCompareFunc)(gconstpointer a, gconstpointer b);

Our code complies with that GCompareFunc signature.

For comparison the new code uses:

  GList *g_list_sort_with_data(GList *list, GCompareDataFunc  compare_func, gpointer user_data);

where the callback is

  typedef gint (*GCompareDataFunc)(gconstpointer a, gconstpointer b, gpointer user_data);

which the new code complies with, but it is undesirable since we
have no use for the data parameter.

With regards,
Daniel
Paolo Bonzini April 16, 2025, 9 a.m. UTC | #2
Il mer 16 apr 2025, 10:29 Daniel P. Berrangé <berrange@redhat.com> ha
scritto:

> > -    secs = g_list_sort(secs, sort_secs);
> > +    secs = g_list_sort_with_data(secs, sort_secs, NULL);
>
> I don't see what the problem is with the original code.
>
> The commit message says we have a bad function cast, but the original
> method decl is
>
>   GList *g_list_sort(GList*list, GCompareFunc compare_func);
>
> where the callback is
>
>    typedef gint (*GCompareFunc)(gconstpointer a, gconstpointer b);
>
> Our code complies with that GCompareFunc signature.
>

The cast is inside glib; g_list_sort casts the function pointer and calls
g_list_sort_with_data.

I suggested this solution to Kohei because it's easy to check that we're
converting all users and not introducing new ones in the future (see
poisoning in patch 10).

Paolo

For comparison the new code uses:
>
>   GList *g_list_sort_with_data(GList *list, GCompareDataFunc
> compare_func, gpointer user_data);
>
> where the callback is
>
>   typedef gint (*GCompareDataFunc)(gconstpointer a, gconstpointer b,
> gpointer user_data);
>
> which the new code complies with, but it is undesirable since we
> have no use for the data parameter.
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-
> https://www.instagram.com/dberrange :|
>
>
Daniel P. Berrangé April 16, 2025, 9:12 a.m. UTC | #3
On Wed, Apr 16, 2025 at 11:00:37AM +0200, Paolo Bonzini wrote:
> Il mer 16 apr 2025, 10:29 Daniel P. Berrangé <berrange@redhat.com> ha
> scritto:
> 
> > > -    secs = g_list_sort(secs, sort_secs);
> > > +    secs = g_list_sort_with_data(secs, sort_secs, NULL);
> >
> > I don't see what the problem is with the original code.
> >
> > The commit message says we have a bad function cast, but the original
> > method decl is
> >
> >   GList *g_list_sort(GList*list, GCompareFunc compare_func);
> >
> > where the callback is
> >
> >    typedef gint (*GCompareFunc)(gconstpointer a, gconstpointer b);
> >
> > Our code complies with that GCompareFunc signature.
> >
> 
> The cast is inside glib; g_list_sort casts the function pointer and calls
> g_list_sort_with_data.

Sigh, the commit message should have said that, as it reads like the
problem is in QEMU code, not glib code.

> I suggested this solution to Kohei because it's easy to check that we're
> converting all users and not introducing new ones in the future (see
> poisoning in patch 10).

It is easy to check this /one/ example, but this pattern of bad casts
is typical in glib with likely many more examples, so avoidance in this
way does not feel like a sustainable long term strategy to me.


With regards,
Daniel
Thomas Huth April 16, 2025, 9:14 a.m. UTC | #4
On 16/04/2025 11.00, Paolo Bonzini wrote:
> 
> 
> Il mer 16 apr 2025, 10:29 Daniel P. Berrangé <berrange@redhat.com 
> <mailto:berrange@redhat.com>> ha scritto:
> 
>      > -    secs = g_list_sort(secs, sort_secs);
>      > +    secs = g_list_sort_with_data(secs, sort_secs, NULL);
> 
>     I don't see what the problem is with the original code.
> 
>     The commit message says we have a bad function cast, but the original
>     method decl is
> 
>        GList *g_list_sort(GList*list, GCompareFunc compare_func);
> 
>     where the callback is
> 
>         typedef gint (*GCompareFunc)(gconstpointer a, gconstpointer b);
> 
>     Our code complies with that GCompareFunc signature.
> 
> 
> The cast is inside glib; g_list_sort casts the function pointer and calls 
> g_list_sort_with_data.

Did anybody already report this to the glib people? I guess it should get 
fixed there in the long run...?

  Thomas
Paolo Bonzini April 16, 2025, 9:31 a.m. UTC | #5
Il mer 16 apr 2025, 11:12 Daniel P. Berrangé <berrange@redhat.com> ha
scritto:

> > I suggested this solution to Kohei because it's easy to check that we're
> > converting all users and not introducing new ones in the future (see
> > poisoning in patch 10).
>
> It is easy to check this /one/ example, but this pattern of bad casts
> is typical in glib with likely many more examples, so avoidance in this
> way does not feel like a sustainable long term strategy to me.
>

If you refer to the general case of function casting then yes, I agree that
there could be other problems in the future but that would have to be
solved in glib.

However QEMU's use of glib is overall relatively limited and for now this
seems to be the only case we hit. What this patch is doing is both
effective at solving the immediate issue, and future proof.

Paolo


>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-
> https://www.instagram.com/dberrange :|
>
>
Philippe Mathieu-Daudé April 16, 2025, 9:35 a.m. UTC | #6
On 16/4/25 10:28, Daniel P. Berrangé wrote:
> On Wed, Apr 16, 2025 at 05:14:06PM +0900, Kohei Tokunaga wrote:
>> On emscripten, function pointer casts can cause function call failure.
>> This commit fixes the function definition to match to the type of the
>> function call using g_list_sort_with_data.
>>
>> Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
>> ---
>>   hw/core/loader.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/core/loader.c b/hw/core/loader.c
>> index 2e35f0aa90..93a8b45d28 100644
>> --- a/hw/core/loader.c
>> +++ b/hw/core/loader.c
>> @@ -1410,7 +1410,7 @@ typedef struct RomSec {
>>    * work, but this way saves a little work later by avoiding
>>    * dealing with "gaps" of 0 length.
>>    */
>> -static gint sort_secs(gconstpointer a, gconstpointer b)
>> +static gint sort_secs(gconstpointer a, gconstpointer b, gpointer d)
>>   {
>>       RomSec *ra = (RomSec *) a;
>>       RomSec *rb = (RomSec *) b;
>> @@ -1463,7 +1463,7 @@ RomGap rom_find_largest_gap_between(hwaddr base, size_t size)
>>       /* sentinel */
>>       secs = add_romsec_to_list(secs, base + size, 1);
>>   
>> -    secs = g_list_sort(secs, sort_secs);
>> +    secs = g_list_sort_with_data(secs, sort_secs, NULL);
> 
> I don't see what the problem is with the original code.

IIUC the rationale is in patch #10 where Kohei poisons g_list_sort():

   On emscripten, function pointer casts can cause function call
   failure. g_list_sort and g_slist_sort performs this internally
   so can't be used on Emscripten. Instead, g_list_sort_with_data
   and g_slist_sort_with_data should be used.

> 
> The commit message says we have a bad function cast, but the original
> method decl is
> 
>    GList *g_list_sort(GList*list, GCompareFunc compare_func);
> 
> where the callback is
> 
>     typedef gint (*GCompareFunc)(gconstpointer a, gconstpointer b);
> 
> Our code complies with that GCompareFunc signature.
> 
> For comparison the new code uses:
> 
>    GList *g_list_sort_with_data(GList *list, GCompareDataFunc  compare_func, gpointer user_data);
> 
> where the callback is
> 
>    typedef gint (*GCompareDataFunc)(gconstpointer a, gconstpointer b, gpointer user_data);
> 
> which the new code complies with, but it is undesirable since we
> have no use for the data parameter.
> 
> With regards,
> Daniel
Kohei Tokunaga April 17, 2025, 9:07 a.m. UTC | #7
Hi Daniel and Philippe,

>>> On emscripten, function pointer casts can cause function call failure.
>>> This commit fixes the function definition to match to the type of the
>>> function call using g_list_sort_with_data.
>>>
>>> Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
>>> ---
>>>   hw/core/loader.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/core/loader.c b/hw/core/loader.c
>>> index 2e35f0aa90..93a8b45d28 100644
>>> --- a/hw/core/loader.c
>>> +++ b/hw/core/loader.c
>>> @@ -1410,7 +1410,7 @@ typedef struct RomSec {
>>>    * work, but this way saves a little work later by avoiding
>>>    * dealing with "gaps" of 0 length.
>>>    */
>>> -static gint sort_secs(gconstpointer a, gconstpointer b)
>>> +static gint sort_secs(gconstpointer a, gconstpointer b, gpointer d)
>>>   {
>>>       RomSec *ra = (RomSec *) a;
>>>       RomSec *rb = (RomSec *) b;
>>> @@ -1463,7 +1463,7 @@ RomGap rom_find_largest_gap_between(hwaddr base,
size_t size)
>>>       /* sentinel */
>>>       secs = add_romsec_to_list(secs, base + size, 1);
>>>
>>> -    secs = g_list_sort(secs, sort_secs);
>>> +    secs = g_list_sort_with_data(secs, sort_secs, NULL);
>>
>> I don't see what the problem is with the original code.
>
>IIUC the rationale is in patch #10 where Kohei poisons g_list_sort():
>
>   On emscripten, function pointer casts can cause function call
>   failure. g_list_sort and g_slist_sort performs this internally
>   so can't be used on Emscripten. Instead, g_list_sort_with_data
>   and g_slist_sort_with_data should be used.

Thank you for the feedback. As Philippe pointed out, the rationale is
explained in the patch #10. I'll make sure to include the explanation in
the commit message of this patch as well.
Kohei Tokunaga April 17, 2025, 9:18 a.m. UTC | #8
Hi Daniel, Paolo and Thomas,

> > > I suggested this solution to Kohei because it's easy to check that
we're
> > > converting all users and not introducing new ones in the future (see
> > > poisoning in patch 10).
> >
> > It is easy to check this /one/ example, but this pattern of bad casts
> > is typical in glib with likely many more examples, so avoidance in this
> > way does not feel like a sustainable long term strategy to me.
>
> If you refer to the general case of function casting then yes, I agree
that there could be other problems in the future but that would have to be
solved in glib.
>
> However QEMU's use of glib is overall relatively limited and for now this
seems to be the only case we hit. What this patch is doing is both
effective at solving the immediate issue, and future proof.

The current patch addresses the cases we've encountered so far, and the
poisoning will help catch the usage of them in future development.

That said, it's worth continuing to explore a more robust solution. I've
submitted an issue to the GLib repository to investigate a long-term fix[1].

[1] https://gitlab.gnome.org/GNOME/glib/-/issues/3670
diff mbox series

Patch

diff --git a/hw/core/loader.c b/hw/core/loader.c
index 2e35f0aa90..93a8b45d28 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -1410,7 +1410,7 @@  typedef struct RomSec {
  * work, but this way saves a little work later by avoiding
  * dealing with "gaps" of 0 length.
  */
-static gint sort_secs(gconstpointer a, gconstpointer b)
+static gint sort_secs(gconstpointer a, gconstpointer b, gpointer d)
 {
     RomSec *ra = (RomSec *) a;
     RomSec *rb = (RomSec *) b;
@@ -1463,7 +1463,7 @@  RomGap rom_find_largest_gap_between(hwaddr base, size_t size)
     /* sentinel */
     secs = add_romsec_to_list(secs, base + size, 1);
 
-    secs = g_list_sort(secs, sort_secs);
+    secs = g_list_sort_with_data(secs, sort_secs, NULL);
 
     for (it = g_list_first(secs); it; it = g_list_next(it)) {
         cand = (RomSec *) it->data;