Message ID | 20210130191719.7085-1-yury.norov@gmail.com (mailing list archive) |
---|---|
Series | lib/find_bit: fast path for small bitmaps |
[add David Laight <David.Laight@ACULAB.COM> ]

On Sat, Jan 30, 2021 at 11:17:11AM -0800, Yury Norov wrote:
> Bitmap operations are much simpler and faster in case of small bitmaps
> which fit into a single word. In linux/bitmap.h we have a machinery that
> allows compiler to replace actual function call with a few instructions
> if bitmaps passed into the function are small and their size is known at
> compile time.
>
> find_*_bit() API lacks this functionality; despite users will benefit from
> it a lot. One important example is cpumask subsystem when
> NR_CPUS <= BITS_PER_LONG. In the very best case, the compiler may replace
> a find_*_bit() call for such a bitmap with a single ffs or ffz instruction.
>
> Tools is synchronized with new implementation where needed.
>
> v1: https://www.spinics.net/lists/kernel/msg3804727.html
> v2: - employ GENMASK() for bitmaps;
>     - unify find_bit inliners in;
>     - address comments to v1;

Comments so far:
 - increased image size (patch #8) - addressed by introducing
   CONFIG_FAST_PATH;
 - split tools and kernel parts - not clear why it's better.

Anything else?
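To illustrate the fast path the cover letter describes, here is a minimal userspace sketch in C. It is not the code from the series. `small_const_nbits()` and `GENMASK()` are modeled on the kernel macros of the same names, `find_first_bit_generic()` is a hypothetical stand-in for the out-of-line implementation, and `__builtin_constant_p()` / `__builtin_ctzl()` are GCC/Clang builtins, so the sketch assumes one of those compilers.

```c
#include <limits.h>

#define BITS_PER_LONG ((unsigned long)(CHAR_BIT * sizeof(unsigned long)))

/* Mask with bits l..h set (h >= l), modeled on the kernel's GENMASK(). */
#define GENMASK(h, l) \
	((~0UL << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

/* True when nbits is a compile-time constant that fits in a single word. */
#define small_const_nbits(nbits) \
	(__builtin_constant_p(nbits) && (nbits) > 0 && (nbits) <= BITS_PER_LONG)

/* Hypothetical out-of-line slow path: scan the bitmap word by word. */
static unsigned long find_first_bit_generic(const unsigned long *addr,
					    unsigned long size)
{
	unsigned long idx;

	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
		if (addr[idx]) {
			unsigned long bit = idx * BITS_PER_LONG +
					    (unsigned long)__builtin_ctzl(addr[idx]);

			/* Ignore set bits beyond the end of the bitmap. */
			return bit < size ? bit : size;
		}
	}
	return size;	/* no set bit found */
}

/*
 * Inline wrapper: when size is a small constant, the whole call can
 * collapse to a mask plus one count-trailing-zeros operation.
 */
static inline unsigned long find_first_bit(const unsigned long *addr,
					   unsigned long size)
{
	if (small_const_nbits(size)) {
		unsigned long val = *addr & GENMASK(size - 1, 0);

		return val ? (unsigned long)__builtin_ctzl(val) : size;
	}
	return find_first_bit_generic(addr, size);
}
```

Both paths return the same result, so the sketch stays correct even when the compiler cannot prove that `size` is constant (for example without optimization, where `__builtin_constant_p()` on a function parameter typically evaluates to 0 and the generic path is taken). Whether the call really reduces to a single instruction depends on the compiler and the target.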
On Mon, Feb 15, 2021 at 01:30:44PM -0800, Yury Norov wrote:
> Comments so far:
>  - increased image size (patch #8) - addressed by introducing
>    CONFIG_FAST_PATH;
>  - split tools and kernel parts - not clear why it's better.

Because tools are user space programs and sometimes may not follow kernel
specifics, so they are different logically and changes should be separated.

> Anything else?
On Tue, Feb 16, 2021 at 11:14:23AM +0200, Andy Shevchenko wrote:
> On Mon, Feb 15, 2021 at 01:30:44PM -0800, Yury Norov wrote:
> > Comments so far:
> >  - increased image size (patch #8) - addressed by introducing
> >    CONFIG_FAST_PATH;
> >
> >  - split tools and kernel parts - not clear why it's better.
>
> Because tools are user space programs and sometimes may not follow kernel
> specifics, so they are different logically and changes should be separated.

In this specific case tools follow kernel well.

Nevertheless, if you think it's a blocker for the series, I can split. What
option for tools is better for you - doubling the number of patches or
squashing everything in a patch bomb?
On Tue, Feb 16, 2021 at 10:00:42AM -0800, Yury Norov wrote:
> On Tue, Feb 16, 2021 at 11:14:23AM +0200, Andy Shevchenko wrote:
> > On Mon, Feb 15, 2021 at 01:30:44PM -0800, Yury Norov wrote:
> > > Comments so far:
> > >  - increased image size (patch #8) - addressed by introducing
> > >    CONFIG_FAST_PATH;
> > >
> > >  - split tools and kernel parts - not clear why it's better.
> >
> > Because tools are user space programs and sometimes may not follow kernel
> > specifics, so they are different logically and changes should be separated.
>
> In this specific case tools follow kernel well.
>
> Nevertheless, if you think it's a blocker for the series, I can split.

It's not a blocker from my side. But you make it harder to push like this,
because you will need a tag from tools, which in my practice is quite hard to
get -> blocker. My point is: don't make obstacles where we can avoid them.
So, if tools won't take this, it won't block us.

> What
> option for tools is better for you - doubling the number of patches or
> squashing everything in a patch bomb?

Not a tools guy, but common sense tells me that the best approach is to follow
kind of changes in the kernel (similar granularity).