diff mbox series

[v2,7/8] lib: move bsearch code

Message ID 87a20884-5a76-a664-dcc9-bd4becee40b3@suse.com (mailing list archive)
State Superseded
Headers show
Series xen: beginnings of moving library-like code into an archive | expand

Commit Message

Jan Beulich Oct. 23, 2020, 10:19 a.m. UTC
Convert this code to an inline function (backed by an instance in an
archive in case the compiler decides against inlining), which results
in not having it in x86 final binaries. This saves a little bit of dead
code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Make the function an extern inline in its header.
---
 xen/common/Makefile        |  1 -
 xen/common/bsearch.c       | 51 --------------------------------------
 xen/include/xen/compiler.h |  1 +
 xen/include/xen/lib.h      | 42 ++++++++++++++++++++++++++++++-
 xen/lib/Makefile           |  1 +
 xen/lib/bsearch.c          | 13 ++++++++++
 6 files changed, 56 insertions(+), 53 deletions(-)
 delete mode 100644 xen/common/bsearch.c
 create mode 100644 xen/lib/bsearch.c

Comments

Julien Grall Nov. 18, 2020, 6:09 p.m. UTC | #1
Hi Jan,

On 23/10/2020 11:19, Jan Beulich wrote:
> Convert this code to an inline function (backed by an instance in an
> archive in case the compiler decides against inlining), which results
> in not having it in x86 final binaries. This saves a little bit of dead
> code.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Make the function an extern inline in its header.
> ---
>   xen/common/Makefile        |  1 -
>   xen/common/bsearch.c       | 51 --------------------------------------
>   xen/include/xen/compiler.h |  1 +
>   xen/include/xen/lib.h      | 42 ++++++++++++++++++++++++++++++-
>   xen/lib/Makefile           |  1 +
>   xen/lib/bsearch.c          | 13 ++++++++++
>   6 files changed, 56 insertions(+), 53 deletions(-)
>   delete mode 100644 xen/common/bsearch.c
>   create mode 100644 xen/lib/bsearch.c
> 
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 7bb779f780a1..d8519a2cc163 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -1,6 +1,5 @@
>   obj-$(CONFIG_ARGO) += argo.o
>   obj-y += bitmap.o
> -obj-y += bsearch.o
>   obj-$(CONFIG_HYPFS_CONFIG) += config_data.o
>   obj-$(CONFIG_CORE_PARKING) += core_parking.o
>   obj-y += cpu.o
> diff --git a/xen/common/bsearch.c b/xen/common/bsearch.c
> deleted file mode 100644
> index 7090930aab5c..000000000000
> --- a/xen/common/bsearch.c
> +++ /dev/null
> @@ -1,51 +0,0 @@
> -/*
> - * A generic implementation of binary search for the Linux kernel
> - *
> - * Copyright (C) 2008-2009 Ksplice, Inc.
> - * Author: Tim Abbott <tabbott@ksplice.com>
> - *
> - * This program is free software; you can redistribute it and/or
> - * modify it under the terms of the GNU General Public License as
> - * published by the Free Software Foundation; version 2.
> - */
> -
> -#include <xen/lib.h>
> -
> -/*
> - * bsearch - binary search an array of elements
> - * @key: pointer to item being searched for
> - * @base: pointer to first element to search
> - * @num: number of elements
> - * @size: size of each element
> - * @cmp: pointer to comparison function
> - *
> - * This function does a binary search on the given array.  The
> - * contents of the array should already be in ascending sorted order
> - * under the provided comparison function.
> - *
> - * Note that the key need not have the same type as the elements in
> - * the array, e.g. key could be a string and the comparison function
> - * could compare the string with the struct's name field.  However, if
> - * the key and elements in the array are of the same type, you can use
> - * the same comparison function for both sort() and bsearch().
> - */
> -void *bsearch(const void *key, const void *base, size_t num, size_t size,
> -	      int (*cmp)(const void *key, const void *elt))
> -{
> -	size_t start = 0, end = num;
> -	int result;
> -
> -	while (start < end) {
> -		size_t mid = start + (end - start) / 2;
> -
> -		result = cmp(key, base + mid * size);
> -		if (result < 0)
> -			end = mid;
> -		else if (result > 0)
> -			start = mid + 1;
> -		else
> -			return (void *)base + mid * size;
> -	}
> -
> -	return NULL;
> -}
> diff --git a/xen/include/xen/compiler.h b/xen/include/xen/compiler.h
> index c0e0ee9f27be..2b7acdf3b188 100644
> --- a/xen/include/xen/compiler.h
> +++ b/xen/include/xen/compiler.h
> @@ -12,6 +12,7 @@
>   
>   #define inline        __inline__
>   #define always_inline __inline__ __attribute__ ((__always_inline__))
> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))

bsearch() is only used by Arm and I haven't seen anyone so far 
complaining about the perf of I/O emulation.

Therefore, I am not convinced that there is enough justification to 
introduce a GNU attribute just for this patch.

Cheers,
Jan Beulich Nov. 19, 2020, 10:27 a.m. UTC | #2
On 18.11.2020 19:09, Julien Grall wrote:
> On 23/10/2020 11:19, Jan Beulich wrote:
>> --- a/xen/include/xen/compiler.h
>> +++ b/xen/include/xen/compiler.h
>> @@ -12,6 +12,7 @@
>>   
>>   #define inline        __inline__
>>   #define always_inline __inline__ __attribute__ ((__always_inline__))
>> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
> 
> bsearch() is only used by Arm and I haven't seen anyone so far 
> complaining about the perf of I/O emulation.
> 
> Therefore, I am not convinced that there is enough justification to 
> introduce a GNU attribute just for this patch.

Please settle this with Andrew: He had asked for the function to
become inline. I don't view making it static inline in the header
as an option here - if the compiler decides to not inline it, we
should not end up with multiple instances in different CUs. And
without making it static inline the attribute needs adding; at
least I'm unaware of an alternative which works with the various
compiler versions.

Jan
Julien Grall Nov. 23, 2020, 10:49 p.m. UTC | #3
Hi Jan,

On 19/11/2020 10:27, Jan Beulich wrote:
> On 18.11.2020 19:09, Julien Grall wrote:
>> On 23/10/2020 11:19, Jan Beulich wrote:
>>> --- a/xen/include/xen/compiler.h
>>> +++ b/xen/include/xen/compiler.h
>>> @@ -12,6 +12,7 @@
>>>    
>>>    #define inline        __inline__
>>>    #define always_inline __inline__ __attribute__ ((__always_inline__))
>>> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
>>
>> bsearch() is only used by Arm and I haven't seen anyone so far
>> complaining about the perf of I/O emulation.
>>
>> Therefore, I am not convinced that there is enough justification to
>> introduce a GNU attribute just for this patch.
> 
> Please settle this with Andrew: He had asked for the function to
> become inline. I don't view making it static inline in the header
> as an option here - if the compiler decides to not inline it, we
> should not end up with multiple instances in different CUs.

That's the cons of static inline... but then why is it suddenly a 
problem with this helper?

> And
> without making it static inline the attribute needs adding; at
> least I'm unaware of an alternative which works with the various
> compiler versions.

The question we have to answer is: What is the gain with this approach?

If it is not quantifiable, then introducing compiler specific attribute 
is not an option.

IIRC, there are only two callers (all in Arm code) of this function. 
Even inlined, I don't believe you would drastically reduce the number of 
instructions compare to a full blown version. To be generous, I would 
say you may save ~20 instructions per copy.

Therefore, so far, the compiler specific attribute doesn't look 
justified to me. As usual, I am happy to be proven wrong.

Cheers,
Andrew Cooper Nov. 24, 2020, 12:40 a.m. UTC | #4
On 23/11/2020 22:49, Julien Grall wrote:
> Hi Jan,
>
> On 19/11/2020 10:27, Jan Beulich wrote:
>> On 18.11.2020 19:09, Julien Grall wrote:
>>> On 23/10/2020 11:19, Jan Beulich wrote:
>>>> --- a/xen/include/xen/compiler.h
>>>> +++ b/xen/include/xen/compiler.h
>>>> @@ -12,6 +12,7 @@
>>>>       #define inline        __inline__
>>>>    #define always_inline __inline__ __attribute__
>>>> ((__always_inline__))
>>>> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
>>>
>>> bsearch() is only used by Arm and I haven't seen anyone so far
>>> complaining about the perf of I/O emulation.
>>>
>>> Therefore, I am not convinced that there is enough justification to
>>> introduce a GNU attribute just for this patch.
>>
>> Please settle this with Andrew: He had asked for the function to
>> become inline. I don't view making it static inline in the header
>> as an option here - if the compiler decides to not inline it, we
>> should not end up with multiple instances in different CUs.
>
> That's the cons of static inline... but then why is it suddenly a
> problem with this helper?
>
>> And
>> without making it static inline the attribute needs adding; at
>> least I'm unaware of an alternative which works with the various
>> compiler versions.
>
> The question we have to answer is: What is the gain with this approach?

Substantial.

>
> If it is not quantifiable, then introducing compiler specific
> attribute is not an option.
>
> IIRC, there are only two callers (all in Arm code) of this function.
> Even inlined, I don't believe you would drastically reduce the number
> of instructions compare to a full blown version. To be generous, I
> would say you may save ~20 instructions per copy.
>
> Therefore, so far, the compiler specific attribute doesn't look
> justified to me. As usual, I am happy to be proven wrong

There is a very good reason why this is the classic example used for
extern inline's in various libc's.

The gains are from the compiler being able to optimise away the function
pointer(s) entirely.  Instead of working on opaque objects, it can see
the accesses directly, implement compares as straight array reads, (for
sorting, the swap() call turns into memcpy()) and because it can see all
the memory accesses, doesn't have to assume that every call to cmp()
modifies arbitrary data in the array (i.e. doesn't have to reload the
objects from memory every iteration).

extern inline allows the compiler full flexibility to judge whether
inlining is a net win, based on optimisation settings and observing what
the practical memory access pattern would be from not inlining.

extern inline is the appropriate thing to use here, except for the big
note in the GCC manual saying "always use gnu_inline in this case" which
appears to be working around a change in the C99 standard which forces
any non-static inline to emit a body even when its not called, due to
rules about global symbols.

Therefore, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Some further observations:

For arch/arm/io.c, the handlers are sorted, so find_mmio_handler() will
be O(lg n), but it will surely be faster with the inlined version, and
this is the fastpath.

register_mmio_handler() OTOH is massively expensive, because sort()
turns the array into a heap and back into an array on every insertion,
just to insert an entry into an already sorted array.  It would be more
efficient to library-fy the work I did for VT-x MSR load/save lists
(again, extern inline) and reuse
"insert_$FOO_into_sorted_list_of_FOOs()" which is a search, single
memmove() to make a gap, and a memcpy() into place.

When you compile io.c with this patch in place, the delta is:

add/remove: 0/1 grow/shrink: 1/0 up/down: 92/-164 (-72)
Function                                     old     new   delta
try_handle_mmio                              720     812     +92
bsearch                                      164       -    -164
Total: Before=992489, After=992417, chg -0.01%

The reason cmp_mmio_handler (140 bytes) doesn't drop out is because it
is referenced by register_mmio_hanlder()'s call to sort().  All in all,
the inlined version is less than 1/3 the size of the out-of-lined
version, but I haven't characterised it further than that.


On a totally separate point,  I wonder if we'd be better off compiling
with -fgnu89-inline because I can't see any case we're we'd want the C99
inline semantics anywhere in Xen.

~Andrew
Jan Beulich Nov. 24, 2020, 9:39 a.m. UTC | #5
On 24.11.2020 01:40, Andrew Cooper wrote:
> On 23/11/2020 22:49, Julien Grall wrote:
>> On 19/11/2020 10:27, Jan Beulich wrote:
>>> On 18.11.2020 19:09, Julien Grall wrote:
>>>> On 23/10/2020 11:19, Jan Beulich wrote:
>>>>> --- a/xen/include/xen/compiler.h
>>>>> +++ b/xen/include/xen/compiler.h
>>>>> @@ -12,6 +12,7 @@
>>>>>       #define inline        __inline__
>>>>>    #define always_inline __inline__ __attribute__
>>>>> ((__always_inline__))
>>>>> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
>>>>
>>>> bsearch() is only used by Arm and I haven't seen anyone so far
>>>> complaining about the perf of I/O emulation.
>>>>
>>>> Therefore, I am not convinced that there is enough justification to
>>>> introduce a GNU attribute just for this patch.
>>>
>>> Please settle this with Andrew: He had asked for the function to
>>> become inline. I don't view making it static inline in the header
>>> as an option here - if the compiler decides to not inline it, we
>>> should not end up with multiple instances in different CUs.
>>
>> That's the cons of static inline... but then why is it suddenly a
>> problem with this helper?
>>
>>> And
>>> without making it static inline the attribute needs adding; at
>>> least I'm unaware of an alternative which works with the various
>>> compiler versions.
>>
>> The question we have to answer is: What is the gain with this approach?
> 
> Substantial.
> 
>>
>> If it is not quantifiable, then introducing compiler specific
>> attribute is not an option.
>>
>> IIRC, there are only two callers (all in Arm code) of this function.
>> Even inlined, I don't believe you would drastically reduce the number
>> of instructions compare to a full blown version. To be generous, I
>> would say you may save ~20 instructions per copy.
>>
>> Therefore, so far, the compiler specific attribute doesn't look
>> justified to me. As usual, I am happy to be proven wrong
> 
> There is a very good reason why this is the classic example used for
> extern inline's in various libc's.
> 
> The gains are from the compiler being able to optimise away the function
> pointer(s) entirely.  Instead of working on opaque objects, it can see
> the accesses directly, implement compares as straight array reads, (for
> sorting, the swap() call turns into memcpy()) and because it can see all
> the memory accesses, doesn't have to assume that every call to cmp()
> modifies arbitrary data in the array (i.e. doesn't have to reload the
> objects from memory every iteration).
> 
> extern inline allows the compiler full flexibility to judge whether
> inlining is a net win, based on optimisation settings and observing what
> the practical memory access pattern would be from not inlining.
> 
> extern inline is the appropriate thing to use here, except for the big
> note in the GCC manual saying "always use gnu_inline in this case" which
> appears to be working around a change in the C99 standard which forces
> any non-static inline to emit a body even when its not called, due to
> rules about global symbols.
> 
> Therefore, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks Andrew.

Julien - please clarify whether you're okay with Andrew's response,
or whether you continue to object the conversion to inline.

> On a totally separate point,  I wonder if we'd be better off compiling
> with -fgnu89-inline because I can't see any case we're we'd want the C99
> inline semantics anywhere in Xen.

I'm not sure about this, i.e. I wouldn't want to exclude such a
case appearing. I think using attributes is better in general, as
it allows fine grained control.

Jan
Julien Grall Nov. 24, 2020, 4:57 p.m. UTC | #6
Hi Andrew,

Thank you for the detailed explanation :).

On 24/11/2020 00:40, Andrew Cooper wrote:
> On 23/11/2020 22:49, Julien Grall wrote:
>> Hi Jan,
>>
>> On 19/11/2020 10:27, Jan Beulich wrote:
>>> On 18.11.2020 19:09, Julien Grall wrote:
>>>> On 23/10/2020 11:19, Jan Beulich wrote:
>>>>> --- a/xen/include/xen/compiler.h
>>>>> +++ b/xen/include/xen/compiler.h
>>>>> @@ -12,6 +12,7 @@
>>>>>        #define inline        __inline__
>>>>>     #define always_inline __inline__ __attribute__
>>>>> ((__always_inline__))
>>>>> +#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
>>>>
>>>> bsearch() is only used by Arm and I haven't seen anyone so far
>>>> complaining about the perf of I/O emulation.
>>>>
>>>> Therefore, I am not convinced that there is enough justification to
>>>> introduce a GNU attribute just for this patch.
>>>
>>> Please settle this with Andrew: He had asked for the function to
>>> become inline. I don't view making it static inline in the header
>>> as an option here - if the compiler decides to not inline it, we
>>> should not end up with multiple instances in different CUs.
>>
>> That's the cons of static inline... but then why is it suddenly a
>> problem with this helper?
>>
>>> And
>>> without making it static inline the attribute needs adding; at
>>> least I'm unaware of an alternative which works with the various
>>> compiler versions.
>>
>> The question we have to answer is: What is the gain with this approach?
> 
> Substantial.
> 
>>
>> If it is not quantifiable, then introducing compiler specific
>> attribute is not an option.
>>
>> IIRC, there are only two callers (all in Arm code) of this function.
>> Even inlined, I don't believe you would drastically reduce the number
>> of instructions compare to a full blown version. To be generous, I
>> would say you may save ~20 instructions per copy.
>>
>> Therefore, so far, the compiler specific attribute doesn't look
>> justified to me. As usual, I am happy to be proven wrong
> 
> There is a very good reason why this is the classic example used for
> extern inline's in various libc's.
> 
> The gains are from the compiler being able to optimise away the function
> pointer(s) entirely.  Instead of working on opaque objects, it can see
> the accesses directly, implement compares as straight array reads, (for
> sorting, the swap() call turns into memcpy()) and because it can see all
> the memory accesses, doesn't have to assume that every call to cmp()
> modifies arbitrary data in the array (i.e. doesn't have to reload the
> objects from memory every iteration).
> 
> extern inline allows the compiler full flexibility to judge whether
> inlining is a net win, based on optimisation settings and observing what
> the practical memory access pattern would be from not inlining.
> 
> extern inline is the appropriate thing to use here, except for the big
> note in the GCC manual saying "always use gnu_inline in this case" which
> appears to be working around a change in the C99 standard which forces
> any non-static inline to emit a body even when its not called, due to
> rules about global symbols.
> 
> Therefore, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> 
> Some further observations:
> 
> For arch/arm/io.c, the handlers are sorted, so find_mmio_handler() will
> be O(lg n), but it will surely be faster with the inlined version, and
> this is the fastpath.
> 
> register_mmio_handler() OTOH is massively expensive, because sort()
> turns the array into a heap and back into an array on every insertion,
> just to insert an entry into an already sorted array.  It would be more
> efficient to library-fy the work I did for VT-x MSR load/save lists
> (again, extern inline) and reuse
> "insert_$FOO_into_sorted_list_of_FOOs()" which is a search, single
> memmove() to make a gap, and a memcpy() into place.
> 
> When you compile io.c with this patch in place, the delta is:
> 
> add/remove: 0/1 grow/shrink: 1/0 up/down: 92/-164 (-72)
> Function                                     old     new   delta
> try_handle_mmio                              720     812     +92
> bsearch                                      164       -    -164
> Total: Before=992489, After=992417, chg -0.01%
> 
> The reason cmp_mmio_handler (140 bytes) doesn't drop out is because it
> is referenced by register_mmio_hanlder()'s call to sort().  All in all,
> the inlined version is less than 1/3 the size of the out-of-lined
> version, but I haven't characterised it further than that.
> 
> 
> On a totally separate point,  I wonder if we'd be better off compiling
> with -fgnu89-inline because I can't see any case we're we'd want the C99
> inline semantics anywhere in Xen.

This was one of my point above. It feels that if we want to use the 
behavior in Xen, then it should be everywhere rather than just this helper.

Cheers,
Jan Beulich Dec. 7, 2020, 10:23 a.m. UTC | #7
On 24.11.2020 17:57, Julien Grall wrote:
> On 24/11/2020 00:40, Andrew Cooper wrote:
>> On a totally separate point,  I wonder if we'd be better off compiling
>> with -fgnu89-inline because I can't see any case we're we'd want the C99
>> inline semantics anywhere in Xen.
> 
> This was one of my point above. It feels that if we want to use the 
> behavior in Xen, then it should be everywhere rather than just this helper.

I'll be committing the series up to patch 6 in a minute. It remains
unclear to me whether your responses on this sub-thread are meant
to be an objection, or just a comment. Andrew gave his R-b despite
this separate consideration, and I now also have an ack from Wei
for the entire series. Please clarify.

Or actually I only thought I could commit a fair initial part of
the series - I'm still lacking Arm-side acks for patches 2 and 3
here.

Jan
Julien Grall Dec. 9, 2020, 9:41 a.m. UTC | #8
Hi Jan,

On 07/12/2020 10:23, Jan Beulich wrote:
> On 24.11.2020 17:57, Julien Grall wrote:
>> On 24/11/2020 00:40, Andrew Cooper wrote:
>>> On a totally separate point,  I wonder if we'd be better off compiling
>>> with -fgnu89-inline because I can't see any case we're we'd want the C99
>>> inline semantics anywhere in Xen.
>>
>> This was one of my point above. It feels that if we want to use the
>> behavior in Xen, then it should be everywhere rather than just this helper.
> 
> I'll be committing the series up to patch 6 in a minute. It remains
> unclear to me whether your responses on this sub-thread are meant
> to be an objection, or just a comment. Andrew gave his R-b despite
> this separate consideration, and I now also have an ack from Wei
> for the entire series. Please clarify.

It still feels strange to apply to one function and not the others... 
But I don't have a strong objection against the idea of using C99 
inlines in Xen.

IOW, I will neither Ack nor NAck this patch.

> 
> Or actually I only thought I could commit a fair initial part of
> the series - I'm still lacking Arm-side acks for patches 2 and 3
> here.

If you remember, I have asked if Anthony could review the build system 
because he worked on it recently.

Unfortunately, I haven't seen any answer so far... I have pinged him on IRC.

Cheers,
Bertrand Marquis Dec. 9, 2020, 2:27 p.m. UTC | #9
Hi Jan,

> On 9 Dec 2020, at 09:41, Julien Grall <julien@xen.org> wrote:
> 
> Hi Jan,
> 
> On 07/12/2020 10:23, Jan Beulich wrote:
>> On 24.11.2020 17:57, Julien Grall wrote:
>>> On 24/11/2020 00:40, Andrew Cooper wrote:
>>>> On a totally separate point,  I wonder if we'd be better off compiling
>>>> with -fgnu89-inline because I can't see any case we're we'd want the C99
>>>> inline semantics anywhere in Xen.
>>> 
>>> This was one of my point above. It feels that if we want to use the
>>> behavior in Xen, then it should be everywhere rather than just this helper.
>> I'll be committing the series up to patch 6 in a minute. It remains
>> unclear to me whether your responses on this sub-thread are meant
>> to be an objection, or just a comment. Andrew gave his R-b despite
>> this separate consideration, and I now also have an ack from Wei
>> for the entire series. Please clarify.
> 
> It still feels strange to apply to one function and not the others... But I don't have a strong objection against the idea of using C99 inlines in Xen.
> 
> IOW, I will neither Ack nor NAck this patch.

I think as Julien here: why doing this inline thing for this function and not the others
provided by the library ?
Could you explain the reasons for this or the use cases you expect ?

I see 2 usage of bsearch in arm code and I do not get why you are doing this for
bsearch and not for the other functions.

Regards
Bertrand

> 
>> Or actually I only thought I could commit a fair initial part of
>> the series - I'm still lacking Arm-side acks for patches 2 and 3
>> here.
> 
> If you remember, I have asked if Anthony could review the build system because he worked on it recently.
> 
> Unfortunately, I haven't seen any answer so far... I have pinged him on IRC.
> 
> Cheers,
> 
> -- 
> Julien Grall
>
Jan Beulich Dec. 9, 2020, 2:54 p.m. UTC | #10
On 09.12.2020 15:27, Bertrand Marquis wrote:
>> On 9 Dec 2020, at 09:41, Julien Grall <julien@xen.org> wrote:
>> On 07/12/2020 10:23, Jan Beulich wrote:
>>> On 24.11.2020 17:57, Julien Grall wrote:
>>>> On 24/11/2020 00:40, Andrew Cooper wrote:
>>>>> On a totally separate point,  I wonder if we'd be better off compiling
>>>>> with -fgnu89-inline because I can't see any case we're we'd want the C99
>>>>> inline semantics anywhere in Xen.
>>>>
>>>> This was one of my point above. It feels that if we want to use the
>>>> behavior in Xen, then it should be everywhere rather than just this helper.
>>> I'll be committing the series up to patch 6 in a minute. It remains
>>> unclear to me whether your responses on this sub-thread are meant
>>> to be an objection, or just a comment. Andrew gave his R-b despite
>>> this separate consideration, and I now also have an ack from Wei
>>> for the entire series. Please clarify.
>>
>> It still feels strange to apply to one function and not the others... But I don't have a strong objection against the idea of using C99 inlines in Xen.
>>
>> IOW, I will neither Ack nor NAck this patch.
> 
> I think as Julien here: why doing this inline thing for this function and not the others
> provided by the library ?
> Could you explain the reasons for this or the use cases you expect ?
> 
> I see 2 usage of bsearch in arm code and I do not get why you are doing this for
> bsearch and not for the other functions.

May I ask whether you read Andrew's quite exhaustive reply to Julien
asking about this? Apart from this, it was Andrew who had asked to
inline bsearch(), and I followed that request. The initial version
of this series didn't have any inlining of these library functions
(which, after all, also isn't the goal of the series).

Jan
Bertrand Marquis Dec. 9, 2020, 3:06 p.m. UTC | #11
Hi Jan,

> On 9 Dec 2020, at 14:54, Jan Beulich <jbeulich@suse.com> wrote:
> 
> On 09.12.2020 15:27, Bertrand Marquis wrote:
>>> On 9 Dec 2020, at 09:41, Julien Grall <julien@xen.org> wrote:
>>> On 07/12/2020 10:23, Jan Beulich wrote:
>>>> On 24.11.2020 17:57, Julien Grall wrote:
>>>>> On 24/11/2020 00:40, Andrew Cooper wrote:
>>>>>> On a totally separate point,  I wonder if we'd be better off compiling
>>>>>> with -fgnu89-inline because I can't see any case we're we'd want the C99
>>>>>> inline semantics anywhere in Xen.
>>>>> 
>>>>> This was one of my point above. It feels that if we want to use the
>>>>> behavior in Xen, then it should be everywhere rather than just this helper.
>>>> I'll be committing the series up to patch 6 in a minute. It remains
>>>> unclear to me whether your responses on this sub-thread are meant
>>>> to be an objection, or just a comment. Andrew gave his R-b despite
>>>> this separate consideration, and I now also have an ack from Wei
>>>> for the entire series. Please clarify.
>>> 
>>> It still feels strange to apply to one function and not the others... But I don't have a strong objection against the idea of using C99 inlines in Xen.
>>> 
>>> IOW, I will neither Ack nor NAck this patch.
>> 
>> I think as Julien here: why doing this inline thing for this function and not the others
>> provided by the library ?
>> Could you explain the reasons for this or the use cases you expect ?
>> 
>> I see 2 usage of bsearch in arm code and I do not get why you are doing this for
>> bsearch and not for the other functions.
> 
> May I ask whether you read Andrew's quite exhaustive reply to Julien
> asking about this? Apart from this, it was Andrew who had asked to
> inline bsearch(), and I followed that request. The initial version
> of this series didn't have any inlining of these library functions
> (which, after all, also isn't the goal of the series).

I just did (sorry missed that one in the history).

But seeing where this is called and the look of the code with this
change i do not really think that the complexity introduced by this
will make a real win in terms of performance/code size but it does
make this complex.

So I would rather have the inline part out but the code is right:
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>

so that this is unblocked.
Regards
Bertrand

> 
> Jan
diff mbox series

Patch

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 7bb779f780a1..d8519a2cc163 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -1,6 +1,5 @@ 
 obj-$(CONFIG_ARGO) += argo.o
 obj-y += bitmap.o
-obj-y += bsearch.o
 obj-$(CONFIG_HYPFS_CONFIG) += config_data.o
 obj-$(CONFIG_CORE_PARKING) += core_parking.o
 obj-y += cpu.o
diff --git a/xen/common/bsearch.c b/xen/common/bsearch.c
deleted file mode 100644
index 7090930aab5c..000000000000
--- a/xen/common/bsearch.c
+++ /dev/null
@@ -1,51 +0,0 @@ 
-/*
- * A generic implementation of binary search for the Linux kernel
- *
- * Copyright (C) 2008-2009 Ksplice, Inc.
- * Author: Tim Abbott <tabbott@ksplice.com>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License as
- * published by the Free Software Foundation; version 2.
- */
-
-#include <xen/lib.h>
-
-/*
- * bsearch - binary search an array of elements
- * @key: pointer to item being searched for
- * @base: pointer to first element to search
- * @num: number of elements
- * @size: size of each element
- * @cmp: pointer to comparison function
- *
- * This function does a binary search on the given array.  The
- * contents of the array should already be in ascending sorted order
- * under the provided comparison function.
- *
- * Note that the key need not have the same type as the elements in
- * the array, e.g. key could be a string and the comparison function
- * could compare the string with the struct's name field.  However, if
- * the key and elements in the array are of the same type, you can use
- * the same comparison function for both sort() and bsearch().
- */
-void *bsearch(const void *key, const void *base, size_t num, size_t size,
-	      int (*cmp)(const void *key, const void *elt))
-{
-	size_t start = 0, end = num;
-	int result;
-
-	while (start < end) {
-		size_t mid = start + (end - start) / 2;
-
-		result = cmp(key, base + mid * size);
-		if (result < 0)
-			end = mid;
-		else if (result > 0)
-			start = mid + 1;
-		else
-			return (void *)base + mid * size;
-	}
-
-	return NULL;
-}
diff --git a/xen/include/xen/compiler.h b/xen/include/xen/compiler.h
index c0e0ee9f27be..2b7acdf3b188 100644
--- a/xen/include/xen/compiler.h
+++ b/xen/include/xen/compiler.h
@@ -12,6 +12,7 @@ 
 
 #define inline        __inline__
 #define always_inline __inline__ __attribute__ ((__always_inline__))
+#define gnu_inline    __inline__ __attribute__ ((__gnu_inline__))
 #define noinline      __attribute__((__noinline__))
 
 #define noreturn      __attribute__((__noreturn__))
diff --git a/xen/include/xen/lib.h b/xen/include/xen/lib.h
index 076bcfb67dbb..940d23755661 100644
--- a/xen/include/xen/lib.h
+++ b/xen/include/xen/lib.h
@@ -192,8 +192,48 @@  void dump_execstate(struct cpu_user_regs *);
 
 void init_constructors(void);
 
+/*
+ * bsearch - binary search an array of elements
+ * @key: pointer to item being searched for
+ * @base: pointer to first element to search
+ * @num: number of elements
+ * @size: size of each element
+ * @cmp: pointer to comparison function
+ *
+ * This function does a binary search on the given array.  The
+ * contents of the array should already be in ascending sorted order
+ * under the provided comparison function.
+ *
+ * Note that the key need not have the same type as the elements in
+ * the array, e.g. key could be a string and the comparison function
+ * could compare the string with the struct's name field.  However, if
+ * the key and elements in the array are of the same type, you can use
+ * the same comparison function for both sort() and bsearch().
+ */
+#ifndef BSEARCH_IMPLEMENTATION
+extern gnu_inline
+#endif
 void *bsearch(const void *key, const void *base, size_t num, size_t size,
-              int (*cmp)(const void *key, const void *elt));
+              int (*cmp)(const void *key, const void *elt))
+{
+    size_t start = 0, end = num;
+    int result;
+
+    while ( start < end )
+    {
+        size_t mid = start + (end - start) / 2;
+
+        result = cmp(key, base + mid * size);
+        if ( result < 0 )
+            end = mid;
+        else if ( result > 0 )
+            start = mid + 1;
+        else
+            return (void *)base + mid * size;
+    }
+
+    return NULL;
+}
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/xen/lib/Makefile b/xen/lib/Makefile
index b469d2dff7b8..122eeb3d327b 100644
--- a/xen/lib/Makefile
+++ b/xen/lib/Makefile
@@ -1,6 +1,7 @@ 
 obj-y += ctors.o
 obj-$(CONFIG_X86) += x86/
 
+lib-y += bsearch.o
 lib-y += ctype.o
 lib-y += list-sort.o
 lib-y += parse-size.o
diff --git a/xen/lib/bsearch.c b/xen/lib/bsearch.c
new file mode 100644
index 000000000000..149f7feafd1f
--- /dev/null
+++ b/xen/lib/bsearch.c
@@ -0,0 +1,13 @@ 
+/*
+ * A generic implementation of binary search for the Linux kernel
+ *
+ * Copyright (C) 2008-2009 Ksplice, Inc.
+ * Author: Tim Abbott <tabbott@ksplice.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; version 2.
+ */
+
+#define BSEARCH_IMPLEMENTATION
+#include <xen/lib.h>