Message ID | 20220915150417.722975-19-glider@google.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Add KernelMemorySanitizer infrastructure | expand |
---------- Forwarded message --------- 发件人: youling257 <youling257@gmail.com> Date: 2022年10月20日周四 上午1:36 Subject: Re: [PATCH v7 18/43] instrumented.h: add KMSAN support To: <glider@google.com> Cc: <youling257@gmail.com> i using linux kernel 6.1rc1 on android, i use gcc12 build kernel 6.1 for android, CONFIG_KMSAN is not set. "instrumented.h: add KMSAN support" cause android bluetooth high CPU usage. git bisect linux kernel 6.1rc1, "instrumented.h: add KMSAN support" is a bad commit for my android. this is my kernel 6.1, revert include/linux/instrumented.h fix high cpu usage problem. https://github.com/youling257/android-mainline/commits/6.1
On Wed, 19 Oct 2022 at 10:37, youling 257 <youling257@gmail.com> wrote: > > > > ---------- Forwarded message --------- > 发件人: youling257 <youling257@gmail.com> > Date: 2022年10月20日周四 上午1:36 > Subject: Re: [PATCH v7 18/43] instrumented.h: add KMSAN support > To: <glider@google.com> > Cc: <youling257@gmail.com> > > > i using linux kernel 6.1rc1 on android, i use gcc12 build kernel 6.1 for android, CONFIG_KMSAN is not set. > "instrumented.h: add KMSAN support" cause android bluetooth high CPU usage. > git bisect linux kernel 6.1rc1, "instrumented.h: add KMSAN support" is a bad commit for my android. > > this is my kernel 6.1, revert include/linux/instrumented.h fix high cpu usage problem. > https://github.com/youling257/android-mainline/commits/6.1 What arch? If x86, can you try to revert only the change to instrument_get_user()? (I wonder if the u64 conversion is causing issues.)
arch x86, this's my revert, https://github.com/youling257/android-mainline/commit/401cbfa61cbfc20c87a5be8e2dda68ac5702389f i tried different revert, have to remove kmsan_copy_to_user. 2022-10-20 1:58 GMT+08:00, Marco Elver <elver@google.com>: > On Wed, 19 Oct 2022 at 10:37, youling 257 <youling257@gmail.com> wrote: >> >> >> >> ---------- Forwarded message --------- >> 发件人: youling257 <youling257@gmail.com> >> Date: 2022年10月20日周四 上午1:36 >> Subject: Re: [PATCH v7 18/43] instrumented.h: add KMSAN support >> To: <glider@google.com> >> Cc: <youling257@gmail.com> >> >> >> i using linux kernel 6.1rc1 on android, i use gcc12 build kernel 6.1 for >> android, CONFIG_KMSAN is not set. >> "instrumented.h: add KMSAN support" cause android bluetooth high CPU >> usage. >> git bisect linux kernel 6.1rc1, "instrumented.h: add KMSAN support" is a >> bad commit for my android. >> >> this is my kernel 6.1, revert include/linux/instrumented.h fix high cpu >> usage problem. >> https://github.com/youling257/android-mainline/commits/6.1 > > What arch? > If x86, can you try to revert only the change to > instrument_get_user()? (I wonder if the u64 conversion is causing > issues.) >
On Thu, Oct 20, 2022 at 03:29AM +0800, youling 257 wrote: [...] > > What arch? > > If x86, can you try to revert only the change to > > instrument_get_user()? (I wonder if the u64 conversion is causing > > issues.) > > > arch x86, this's my revert, > https://github.com/youling257/android-mainline/commit/401cbfa61cbfc20c87a5be8e2dda68ac5702389f > i tried different revert, have to remove kmsan_copy_to_user. There you reverted only instrument_put_user() - does it fix the issue? If not, can you try only something like this (only revert instrument_get_user()): diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h index 501fa8486749..dbe3ec38d0e6 100644 --- a/include/linux/instrumented.h +++ b/include/linux/instrumented.h @@ -167,9 +167,6 @@ instrument_copy_from_user_after(const void *to, const void __user *from, */ #define instrument_get_user(to) \ ({ \ - u64 __tmp = (u64)(to); \ - kmsan_unpoison_memory(&__tmp, sizeof(__tmp)); \ - to = __tmp; \ }) Once we know which one of these is the issue, we can figure out a proper fix. Thanks, -- Marco
That is i did,i already test, remove "u64 __tmp…kmsan_unpoison_memory", no help. i only remove kmsan_copy_to_user, fix my issue. 2022-10-20 4:00 GMT+08:00, Marco Elver <elver@google.com>: > On Thu, Oct 20, 2022 at 03:29AM +0800, youling 257 wrote: > [...] >> > What arch? >> > If x86, can you try to revert only the change to >> > instrument_get_user()? (I wonder if the u64 conversion is causing >> > issues.) >> > >> arch x86, this's my revert, >> https://github.com/youling257/android-mainline/commit/401cbfa61cbfc20c87a5be8e2dda68ac5702389f >> i tried different revert, have to remove kmsan_copy_to_user. > > There you reverted only instrument_put_user() - does it fix the issue? > > If not, can you try only something like this (only revert > instrument_get_user()): > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > index 501fa8486749..dbe3ec38d0e6 100644 > --- a/include/linux/instrumented.h > +++ b/include/linux/instrumented.h > @@ -167,9 +167,6 @@ instrument_copy_from_user_after(const void *to, const > void __user *from, > */ > #define instrument_get_user(to) \ > ({ \ > - u64 __tmp = (u64)(to); \ > - kmsan_unpoison_memory(&__tmp, sizeof(__tmp)); \ > - to = __tmp; \ > }) > > > Once we know which one of these is the issue, we can figure out a proper > fix. > > Thanks, > > -- Marco >
On Thu, Oct 20, 2022 at 04:07AM +0800, youling 257 wrote: > That is i did,i already test, remove "u64 __tmp…kmsan_unpoison_memory", no help. > i only remove kmsan_copy_to_user, fix my issue. Ok - does only the below work (without the reverts)? diff --git a/include/linux/kmsan-checks.h b/include/linux/kmsan-checks.h index c4cae333deec..eb05caa8f523 100644 --- a/include/linux/kmsan-checks.h +++ b/include/linux/kmsan-checks.h @@ -73,8 +73,8 @@ static inline void kmsan_unpoison_memory(const void *address, size_t size) static inline void kmsan_check_memory(const void *address, size_t size) { } -static inline void kmsan_copy_to_user(void __user *to, const void *from, - size_t to_copy, size_t left) +static __always_inline void kmsan_copy_to_user(void __user *to, const void *from, + size_t to_copy, size_t left) { } ... because when you say only removing kmsan_copy_to_user() (from instrument_put_user()) works, it really doesn't make any sense. The only explanation would be if the compiler inlining is broken.
On Wed, Oct 19, 2022 at 1:07 PM youling 257 <youling257@gmail.com> wrote: > > That is i did,i already test, remove "u64 __tmp…kmsan_unpoison_memory", no help. > i only remove kmsan_copy_to_user, fix my issue. Do you have any profiling results that can help pinpoint functions that are affected by this change? Also, am I getting it right you are building x86 Android? > 2022-10-20 4:00 GMT+08:00, Marco Elver <elver@google.com>: > > On Thu, Oct 20, 2022 at 03:29AM +0800, youling 257 wrote: > > [...] > >> > What arch? > >> > If x86, can you try to revert only the change to > >> > instrument_get_user()? (I wonder if the u64 conversion is causing > >> > issues.) > >> > > >> arch x86, this's my revert, > >> https://github.com/youling257/android-mainline/commit/401cbfa61cbfc20c87a5be8e2dda68ac5702389f > >> i tried different revert, have to remove kmsan_copy_to_user. > > > > There you reverted only instrument_put_user() - does it fix the issue? > > > > If not, can you try only something like this (only revert > > instrument_get_user()): > > > > diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h > > index 501fa8486749..dbe3ec38d0e6 100644 > > --- a/include/linux/instrumented.h > > +++ b/include/linux/instrumented.h > > @@ -167,9 +167,6 @@ instrument_copy_from_user_after(const void *to, const > > void __user *from, > > */ > > #define instrument_get_user(to) \ > > ({ \ > > - u64 __tmp = (u64)(to); \ > > - kmsan_unpoison_memory(&__tmp, sizeof(__tmp)); \ > > - to = __tmp; \ > > }) > > > > > > Once we know which one of these is the issue, we can figure out a proper > > fix. > > > > Thanks, > > > > -- Marco > > > > -- > You received this message because you are subscribed to the Google Groups "kasan-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/CAOzgRdY6KSxDMRJ%2Bq2BWHs4hRQc5y-PZ2NYG%2B%2B-AMcUrO8YOgA%40mail.gmail.com. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Liana Sebastian Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg
without the revert, this include/linux/kmsan-checks.h patch no help. busybox top, com.android.bluetooth cpu usage 12%. my kernel config, config-6.1.0-rc1-android-x86_64+ CPU: 3.6% usr 12.1% sys 0.0% nic 84.1% idle 0.0% io 0.0% irq 0.0% sirq Load average: 1.49 0.65 0.25 4/1336 4940 PID PPID USER STAT VSZ %VSZ CPU %CPU COMMAND 1381 926 1002 S 1265m 16.1 5 12.4 {droid.bluetooth} com.android.bluetooth 919 1 0 S< 187m 2.4 2 2.2 /system/bin/surfaceflinger 2220 926 10327 S 1882m 24.0 7 1.1 com.baidu.input 3163 926 10175 S 2153m 27.5 0 0.0 {rojekti.clipper} fi.rojekti.clipper 1250 926 1000 S< 2140m 27.3 0 0.0 system_server 2548 926 10125 S 2078m 26.5 4 0.0 {chiztech.swapps} com.schiztech.swapps 2516 926 10217 S 2058m 26.3 2 0.0 {ogle.android.gm} com.google.android.gm 2925 926 10318 S 1876m 24.0 1 0.0 {onelli.juicessh} com.sonelli.juicessh 926 1 0 S 1780m 22.7 7 0.0 {main} zygote 3374 3369 0 S 1541m 19.7 2 0.0 {main} com.topjohnwu.magisk:root 3079 926 10021 S 1477m 18.9 0 0.0 {ndroid.systemui} com.android.systemui 2022-10-20 5:36 GMT+08:00, Marco Elver <elver@google.com>: > On Thu, Oct 20, 2022 at 04:07AM +0800, youling 257 wrote: >> That is i did,i already test, remove "u64 __tmp…kmsan_unpoison_memory", no >> help. >> i only remove kmsan_copy_to_user, fix my issue. > > Ok - does only the below work (without the reverts)? > > diff --git a/include/linux/kmsan-checks.h b/include/linux/kmsan-checks.h > index c4cae333deec..eb05caa8f523 100644 > --- a/include/linux/kmsan-checks.h > +++ b/include/linux/kmsan-checks.h > @@ -73,8 +73,8 @@ static inline void kmsan_unpoison_memory(const void > *address, size_t size) > static inline void kmsan_check_memory(const void *address, size_t size) > { > } > -static inline void kmsan_copy_to_user(void __user *to, const void *from, > - size_t to_copy, size_t left) > +static __always_inline void kmsan_copy_to_user(void __user *to, const void > *from, > + size_t to_copy, size_t left) > { > } > > > ... because when you say only removing kmsan_copy_to_user() (from > instrument_put_user()) works, it really doesn't make any sense. The only > explanation would be if the compiler inlining is broken. >
On Wed, Oct 19, 2022 at 2:37 PM 'Marco Elver' via kasan-dev < kasan-dev@googlegroups.com> wrote: > On Thu, Oct 20, 2022 at 04:07AM +0800, youling 257 wrote: > > That is i did,i already test, remove "u64 __tmp…kmsan_unpoison_memory", > no help. > > i only remove kmsan_copy_to_user, fix my issue. > > Ok - does only the below work (without the reverts)? > > diff --git a/include/linux/kmsan-checks.h b/include/linux/kmsan-checks.h > index c4cae333deec..eb05caa8f523 100644 > --- a/include/linux/kmsan-checks.h > +++ b/include/linux/kmsan-checks.h > @@ -73,8 +73,8 @@ static inline void kmsan_unpoison_memory(const void > *address, size_t size) > static inline void kmsan_check_memory(const void *address, size_t size) > { > } > -static inline void kmsan_copy_to_user(void __user *to, const void *from, > - size_t to_copy, size_t left) > +static __always_inline void kmsan_copy_to_user(void __user *to, const > void *from, > + size_t to_copy, size_t left) > { > } > > > ... because when you say only removing kmsan_copy_to_user() (from > instrument_put_user()) works, it really doesn't make any sense. The only > explanation would be if the compiler inlining is broken. > > If what Marco suggests does not help, could you post the output of `nm -S vmlinux` with and without your revert so that we can see which functions were affected by the change? Unfortunately the top results are of no help, do you have the `perf` tool available in your system? > -- > You received this message because you are subscribed to the Google Groups > "kasan-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to kasan-dev+unsubscribe@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/kasan-dev/Y1Bt%2BIa93mVV/lT3%40elver.google.com > . >
How to use perf tool? I use diff compare "nm -S vmlinux". android_x86:/ $ su android_x86:/ # diff /sdcard/with_vmlinux /sdcard/without_vmlinux --- /sdcard/with_vmlinux +++ /sdcard/without_vmlinux @@ -12951,7 +12951,7 @@ ffffffff81338400 00000000000000dc T compat_only_sysfs_link_entry_to_kobj ffffffff810884d0 00000000000002b7 T compat_ptrace_request ffffffff812b4e80 0000000000000025 T compat_ptr_ioctl -ffffffff811380d0 000000000000008d T compat_put_bitmap +ffffffff811380d0 000000000000008a T compat_put_bitmap ffffffff81a62180 000000000000000f t compat_raw_ioctl ffffffff81afa060 000000000000000f t compat_rawv6_ioctl ffffffff81090790 0000000000000018 T compat_restore_altstack @@ -14260,7 +14260,7 @@ ffffffff8167d380 00000000000000dd T cppc_set_enable ffffffff8167d460 0000000000000221 T cppc_set_perf ffffffff818cdbe0 000000000000004f t cppc_update_perf -ffffffff8102bba0 000000000000012b t cp_stat64 +ffffffff8102bba0 000000000000012a t cp_stat64 ffffffff81494810 0000000000000028 t cp_status_show ffffffff812a33e0 00000000000001b2 t cp_statx ffffffff82e1c934 0000000000000004 b cpu0_hotpluggable @@ -26993,7 +26993,6 @@ ffffffff821b8000 0000000000000038 d CSWTCH.102 ffffffff821cd6c0 0000000000000048 d CSWTCH.107 ffffffff821d4b80 0000000000000018 d CSWTCH.11 -ffffffff8203dd88 000000000000000c d CSWTCH.111 ffffffff82152de0 00000000000000a0 d CSWTCH.111 ffffffff82105ea0 0000000000000020 d CSWTCH.1116 ffffffff82105e80 0000000000000020 d CSWTCH.1121 @@ -27002,6 +27001,7 @@ ffffffff821d4b60 0000000000000018 d CSWTCH.12 ffffffff821b2c60 0000000000000024 d CSWTCH.120 ffffffff82033fc0 0000000000000068 d CSWTCH.123 +ffffffff8203dd88 000000000000000c d CSWTCH.123 ffffffff82014b80 00000000000000a0 d CSWTCH.125 ffffffff82014b40 0000000000000028 d CSWTCH.128 ffffffff82033f40 0000000000000064 d CSWTCH.128 @@ -27026,7 +27026,6 @@ ffffffff821c9b80 0000000000000018 d CSWTCH.189 ffffffff821b39a0 0000000000000028 d CSWTCH.19 ffffffff820055c0 0000000000000024 d CSWTCH.195 -ffffffff82208de0 0000000000000004 d CSWTCH.199 ffffffff82146127 0000000000000006 d CSWTCH.2 ffffffff82146380 0000000000000014 d CSWTCH.2 ffffffff8219f770 0000000000000018 d CSWTCH.20 @@ -27033,5 +27032,6 @@ ffffffff821d9c90 0000000000000018 d CSWTCH.20 ffffffff821b7e60 0000000000000128 d CSWTCH.202 +ffffffff82208de0 0000000000000004 d CSWTCH.202 ffffffff821b7cc0 0000000000000188 d CSWTCH.204 ffffffff821b7c80 0000000000000038 d CSWTCH.206 ffffffff821a4ac0 0000000000000006 d CSWTCH.207 @@ -27043,8 +27043,8 @@ ffffffff821b7be0 0000000000000038 d CSWTCH.214 ffffffff821c9940 0000000000000030 d CSWTCH.217 ffffffff8219f750 0000000000000018 d CSWTCH.22 -ffffffff82033580 0000000000000023 d CSWTCH.221 ffffffff8213de80 0000000000000028 d CSWTCH.222 +ffffffff82033580 0000000000000023 d CSWTCH.223 ffffffff82216c10 0000000000000014 d CSWTCH.225 ffffffff82216bf0 0000000000000014 d CSWTCH.232 ffffffff82216be0 0000000000000010 d CSWTCH.234 @@ -27052,7 +27052,7 @@ ffffffff821b3800 0000000000000188 d CSWTCH.24 ffffffff821f3340 0000000000000058 d CSWTCH.242 ffffffff82158c60 0000000000000030 d CSWTCH.244 -ffffffff821a9868 000000000000000a d CSWTCH.255 +ffffffff821a9868 000000000000000a d CSWTCH.258 ffffffff82106a20 0000000000000010 d CSWTCH.261 ffffffff821a42a0 0000000000000004 d CSWTCH.261 ffffffff8203d9e0 0000000000000030 d CSWTCH.265 @@ -27059,12 +27059,12 @@ ffffffff82126780 0000000000000028 d CSWTCH.27 ffffffff8214d360 0000000000000040 d CSWTCH.278 ffffffff82136f80 0000000000000024 d CSWTCH.28 -ffffffff821b3000 0000000000000024 d CSWTCH.281 -ffffffff821b2fc0 0000000000000024 d CSWTCH.282 -ffffffff8210ed40 000000000000002c d CSWTCH.289 -ffffffff8210ed00 000000000000002b d CSWTCH.290 -ffffffff8210ecc0 000000000000002c d CSWTCH.292 +ffffffff821b3000 0000000000000024 d CSWTCH.285 +ffffffff821b2fc0 0000000000000024 d CSWTCH.286 +ffffffff8210ed40 000000000000002c d CSWTCH.292 ffffffff821b3ae0 0000000000000020 d CSWTCH.292 +ffffffff8210ed00 000000000000002b d CSWTCH.293 +ffffffff8210ecc0 000000000000002c d CSWTCH.295 ffffffff8214d340 0000000000000020 d CSWTCH.296 ffffffff82146121 0000000000000006 d CSWTCH.3 ffffffff821b7ba0 0000000000000020 d CSWTCH.3 @@ -27075,17 +27075,17 @@ ffffffff821d3c40 0000000000000020 d CSWTCH.31 ffffffff820ff780 0000000000000004 d CSWTCH.318 ffffffff821b5cc0 0000000000000038 d CSWTCH.33 -ffffffff8214efa0 0000000000000010 d CSWTCH.334 +ffffffff8214efa0 0000000000000010 d CSWTCH.346 ffffffff8214a8c0 0000000000000018 d CSWTCH.35 -ffffffff8210ecb0 000000000000000c d CSWTCH.359 -ffffffff8202cae0 000000000000002c d CSWTCH.360 +ffffffff8210ecb0 000000000000000c d CSWTCH.362 ffffffff82016300 0000000000000020 d CSWTCH.363 ffffffff821fce00 0000000000000040 d CSWTCH.38 -ffffffff82032780 0000000000000074 d CSWTCH.387 +ffffffff8202cae0 000000000000002c d CSWTCH.381 +ffffffff82032780 0000000000000074 d CSWTCH.390 ffffffff82118100 0000000000000018 d CSWTCH.40 ffffffff821598a0 0000000000000020 d CSWTCH.41 -ffffffff82032760 0000000000000020 d CSWTCH.428 ffffffff82139a20 0000000000000038 d CSWTCH.43 +ffffffff82032760 0000000000000020 d CSWTCH.431 ffffffff82015d40 0000000000000040 d CSWTCH.45 ffffffff821399e0 0000000000000040 d CSWTCH.45 ffffffff822194e0 000000000000000c d CSWTCH.450 @@ -27092,5 +27092,5 @@ ffffffff8214d2e0 0000000000000048 d CSWTCH.459 -ffffffff8210eca0 000000000000000c d CSWTCH.464 +ffffffff8210eca0 000000000000000c d CSWTCH.467 ffffffff821399c0 0000000000000020 d CSWTCH.47 ffffffff8214a8e0 00000000000000c8 d CSWTCH.48 ffffffff82028ac0 0000000000000018 d CSWTCH.49 @@ -27097,8 +27097,8 @@ ffffffff821399a0 0000000000000020 d CSWTCH.49 -ffffffff82032740 0000000000000018 d CSWTCH.507 -ffffffff82032730 000000000000000c d CSWTCH.508 -ffffffff82032720 000000000000000c d CSWTCH.509 ffffffff82139960 0000000000000030 d CSWTCH.51 +ffffffff82032740 0000000000000018 d CSWTCH.510 +ffffffff82032730 000000000000000c d CSWTCH.511 +ffffffff82032720 000000000000000c d CSWTCH.512 ffffffff821595c0 0000000000000100 d CSWTCH.52 ffffffff821b6240 0000000000000018 d CSWTCH.52 ffffffff821371c0 0000000000000024 d CSWTCH.523 @@ -27125,7 +27125,7 @@ ffffffff82139680 0000000000000058 d CSWTCH.72 ffffffff82139620 0000000000000048 d CSWTCH.74 ffffffff8213c740 0000000000000020 d CSWTCH.74 -ffffffff821f04a0 0000000000000018 d CSWTCH.753 +ffffffff821f04a0 0000000000000018 d CSWTCH.766 ffffffff8200e4e0 0000000000000020 d CSWTCH.77 ffffffff821bffc0 0000000000000018 d CSWTCH.78 ffffffff8200e4c0 0000000000000020 d CSWTCH.79 1|android_x86:/ # 2022-10-21 2:14 GMT+08:00, Alexander Potapenko <glider@google.com>: > On Wed, Oct 19, 2022 at 2:37 PM 'Marco Elver' via kasan-dev < > kasan-dev@googlegroups.com> wrote: > >> On Thu, Oct 20, 2022 at 04:07AM +0800, youling 257 wrote: >> > That is i did,i already test, remove "u64 __tmp…kmsan_unpoison_memory", >> no help. >> > i only remove kmsan_copy_to_user, fix my issue. >> >> Ok - does only the below work (without the reverts)? >> >> diff --git a/include/linux/kmsan-checks.h b/include/linux/kmsan-checks.h >> index c4cae333deec..eb05caa8f523 100644 >> --- a/include/linux/kmsan-checks.h >> +++ b/include/linux/kmsan-checks.h >> @@ -73,8 +73,8 @@ static inline void kmsan_unpoison_memory(const void >> *address, size_t size) >> static inline void kmsan_check_memory(const void *address, size_t size) >> { >> } >> -static inline void kmsan_copy_to_user(void __user *to, const void *from, >> - size_t to_copy, size_t left) >> +static __always_inline void kmsan_copy_to_user(void __user *to, const >> void *from, >> + size_t to_copy, size_t >> left) >> { >> } >> >> >> ... because when you say only removing kmsan_copy_to_user() (from >> instrument_put_user()) works, it really doesn't make any sense. The only >> explanation would be if the compiler inlining is broken. >> >> > If what Marco suggests does not help, could you post the output of `nm -S > vmlinux` with and without your revert so that we can see which functions > were affected by the change? > > Unfortunately the top results are of no help, do you have the `perf` tool > available in your system? > > >> -- >> You received this message because you are subscribed to the Google Groups >> "kasan-dev" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to kasan-dev+unsubscribe@googlegroups.com. >> To view this discussion on the web visit >> https://groups.google.com/d/msgid/kasan-dev/Y1Bt%2BIa93mVV/lT3%40elver.google.com >> . >> > > > -- > Alexander Potapenko > Software Engineer > > Google Germany GmbH > Erika-Mann-Straße, 33 > 80636 München > > Geschäftsführer: Paul Manicle, Liana Sebastian > Registergericht und -nummer: Hamburg, HRB 86891 > Sitz der Gesellschaft: Hamburg >
On Thu, 20 Oct 2022 at 22:55, youling 257 <youling257@gmail.com> wrote: > > How to use perf tool? The simplest would be to try just "perf top" - and see which kernel functions consume most CPU cycles. I would suggest you compare both kernels, and see if you can spot a function which uses more cycles% in the problematic kernel.
PerfTop: 8253 irqs/sec kernel:75.3% exact: 100.0% lost: 0/0 drop: 0/17899 [4000Hz cycles], (all, 8 CPUs) --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 14.87% [kernel] [k] 0xffffffff941d1f37 6.71% [kernel] [k] 0xffffffff942016cf what is 0xffffffff941d1f37? 2022-10-21 14:16 GMT+08:00, Marco Elver <elver@google.com>: > On Thu, 20 Oct 2022 at 22:55, youling 257 <youling257@gmail.com> wrote: >> >> How to use perf tool? > > The simplest would be to try just "perf top" - and see which kernel > functions consume most CPU cycles. I would suggest you compare both > kernels, and see if you can spot a function which uses more cycles% in > the problematic kernel. >
On Thu, 20 Oct 2022 at 23:39, youling 257 <youling257@gmail.com> wrote: > > PerfTop: 8253 irqs/sec kernel:75.3% exact: 100.0% lost: 0/0 drop: > 0/17899 [4000Hz cycles], (all, 8 CPUs) > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > > 14.87% [kernel] [k] 0xffffffff941d1f37 > 6.71% [kernel] [k] 0xffffffff942016cf > > what is 0xffffffff941d1f37? You need to build with debug symbols: CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y Then it'll show function names. > 2022-10-21 14:16 GMT+08:00, Marco Elver <elver@google.com>: > > On Thu, 20 Oct 2022 at 22:55, youling 257 <youling257@gmail.com> wrote: > >> > >> How to use perf tool? > > > > The simplest would be to try just "perf top" - and see which kernel > > functions consume most CPU cycles. I would suggest you compare both > > kernels, and see if you can spot a function which uses more cycles% in > > the problematic kernel. > >
CONFIG_DEBUG_INFO=y CONFIG_AS_HAS_NON_CONST_LEB128=y # CONFIG_DEBUG_INFO_NONE is not set CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y # CONFIG_DEBUG_INFO_DWARF4 is not set # CONFIG_DEBUG_INFO_DWARF5 is not set # CONFIG_DEBUG_INFO_REDUCED is not set # CONFIG_DEBUG_INFO_COMPRESSED is not set # CONFIG_DEBUG_INFO_SPLIT is not set # CONFIG_DEBUG_INFO_BTF is not set # CONFIG_GDB_SCRIPTS is not set perf top still no function name. 12.90% [kernel] [k] 0xffffffff833dfa64 3.78% [kernel] [k] 0xffffffff8285b439 3.61% [kernel] [k] 0xffffffff83370254 2.32% [kernel] [k] 0xffffffff8337025b 1.88% bluetooth.default.so [.] 0x000000000000d09d 2022-10-21 15:37 GMT+08:00, Marco Elver <elver@google.com>: > On Thu, 20 Oct 2022 at 23:39, youling 257 <youling257@gmail.com> wrote: >> >> PerfTop: 8253 irqs/sec kernel:75.3% exact: 100.0% lost: 0/0 drop: >> 0/17899 [4000Hz cycles], (all, 8 CPUs) >> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> 14.87% [kernel] [k] 0xffffffff941d1f37 >> 6.71% [kernel] [k] 0xffffffff942016cf >> >> what is 0xffffffff941d1f37? > > You need to build with debug symbols: > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y > > Then it'll show function names. > >> 2022-10-21 14:16 GMT+08:00, Marco Elver <elver@google.com>: >> > On Thu, 20 Oct 2022 at 22:55, youling 257 <youling257@gmail.com> wrote: >> >> >> >> How to use perf tool? >> > >> > The simplest would be to try just "perf top" - and see which kernel >> > functions consume most CPU cycles. I would suggest you compare both >> > kernels, and see if you can spot a function which uses more cycles% in >> > the problematic kernel. >> > >
On Fri, Oct 21, 2022 at 8:19 AM youling 257 <youling257@gmail.com> wrote: > CONFIG_DEBUG_INFO=y > CONFIG_AS_HAS_NON_CONST_LEB128=y > # CONFIG_DEBUG_INFO_NONE is not set > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y > # CONFIG_DEBUG_INFO_DWARF4 is not set > # CONFIG_DEBUG_INFO_DWARF5 is not set > # CONFIG_DEBUG_INFO_REDUCED is not set > # CONFIG_DEBUG_INFO_COMPRESSED is not set > # CONFIG_DEBUG_INFO_SPLIT is not set > # CONFIG_DEBUG_INFO_BTF is not set > # CONFIG_GDB_SCRIPTS is not set > > perf top still no function name. > Will it help if you disable CONFIG_RANDOMIZE_BASE? (if it doesn't show the symbols, at least we'll be able to figure out the offending function by running nm) > > 12.90% [kernel] [k] 0xffffffff833dfa64 > 3.78% [kernel] [k] 0xffffffff8285b439 > 3.61% [kernel] [k] 0xffffffff83370254 > 2.32% [kernel] [k] 0xffffffff8337025b > 1.88% bluetooth.default.so [.] 0x000000000000d09d > > 2022-10-21 15:37 GMT+08:00, Marco Elver <elver@google.com>: > > On Thu, 20 Oct 2022 at 23:39, youling 257 <youling257@gmail.com> wrote: > >> > >> PerfTop: 8253 irqs/sec kernel:75.3% exact: 100.0% lost: 0/0 drop: > >> 0/17899 [4000Hz cycles], (all, 8 CPUs) > >> > --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- > >> > >> 14.87% [kernel] [k] 0xffffffff941d1f37 > >> 6.71% [kernel] [k] 0xffffffff942016cf > >> > >> what is 0xffffffff941d1f37? > > > > You need to build with debug symbols: > > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y > > > > Then it'll show function names. > > > >> 2022-10-21 14:16 GMT+08:00, Marco Elver <elver@google.com>: > >> > On Thu, 20 Oct 2022 at 22:55, youling 257 <youling257@gmail.com> > wrote: > >> >> > >> >> How to use perf tool? > >> > > >> > The simplest would be to try just "perf top" - and see which kernel > >> > functions consume most CPU cycles. I would suggest you compare both > >> > kernels, and see if you can spot a function which uses more cycles% in > >> > the problematic kernel. > >> > > > >
On October 21, 2022 10:02:05 AM PDT, Alexander Potapenko <glider@google.com> wrote: >On Fri, Oct 21, 2022 at 8:19 AM youling 257 <youling257@gmail.com> wrote: > >> CONFIG_DEBUG_INFO=y >> CONFIG_AS_HAS_NON_CONST_LEB128=y >> # CONFIG_DEBUG_INFO_NONE is not set >> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y >> # CONFIG_DEBUG_INFO_DWARF4 is not set >> # CONFIG_DEBUG_INFO_DWARF5 is not set >> # CONFIG_DEBUG_INFO_REDUCED is not set >> # CONFIG_DEBUG_INFO_COMPRESSED is not set >> # CONFIG_DEBUG_INFO_SPLIT is not set >> # CONFIG_DEBUG_INFO_BTF is not set >> # CONFIG_GDB_SCRIPTS is not set >> >> perf top still no function name. >> >Will it help if you disable CONFIG_RANDOMIZE_BASE? >(if it doesn't show the symbols, at least we'll be able to figure out the >offending function by running nm) Is KALLSYMS needed? > > >> >> 12.90% [kernel] [k] 0xffffffff833dfa64 >> 3.78% [kernel] [k] 0xffffffff8285b439 >> 3.61% [kernel] [k] 0xffffffff83370254 >> 2.32% [kernel] [k] 0xffffffff8337025b >> 1.88% bluetooth.default.so [.] 0x000000000000d09d >> >> 2022-10-21 15:37 GMT+08:00, Marco Elver <elver@google.com>: >> > On Thu, 20 Oct 2022 at 23:39, youling 257 <youling257@gmail.com> wrote: >> >> >> >> PerfTop: 8253 irqs/sec kernel:75.3% exact: 100.0% lost: 0/0 drop: >> >> 0/17899 [4000Hz cycles], (all, 8 CPUs) >> >> >> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> >> >> 14.87% [kernel] [k] 0xffffffff941d1f37 >> >> 6.71% [kernel] [k] 0xffffffff942016cf >> >> >> >> what is 0xffffffff941d1f37? >> > >> > You need to build with debug symbols: >> > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y >> > >> > Then it'll show function names. >> > >> >> 2022-10-21 14:16 GMT+08:00, Marco Elver <elver@google.com>: >> >> > On Thu, 20 Oct 2022 at 22:55, youling 257 <youling257@gmail.com> >> wrote: >> >> >> >> >> >> How to use perf tool? >> >> > >> >> > The simplest would be to try just "perf top" - and see which kernel >> >> > functions consume most CPU cycles. I would suggest you compare both >> >> > kernels, and see if you can spot a function which uses more cycles% in >> >> > the problematic kernel. >> >> > >> > >> > >
On Fri, Oct 21, 2022 at 8:19 AM youling 257 <youling257@gmail.com> wrote: > CONFIG_DEBUG_INFO=y > CONFIG_AS_HAS_NON_CONST_LEB128=y > # CONFIG_DEBUG_INFO_NONE is not set > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y > # CONFIG_DEBUG_INFO_DWARF4 is not set > # CONFIG_DEBUG_INFO_DWARF5 is not set > # CONFIG_DEBUG_INFO_REDUCED is not set > # CONFIG_DEBUG_INFO_COMPRESSED is not set > # CONFIG_DEBUG_INFO_SPLIT is not set > # CONFIG_DEBUG_INFO_BTF is not set > # CONFIG_GDB_SCRIPTS is not set > > perf top still no function name. > > 12.90% [kernel] [k] 0xffffffff833dfa64 > I think I know what's going on. The two functions that differ with and without the patch were passing an incremented pointer to unsafe_put_user(), which is a macro, e.g.: unsafe_put_user((compat_ulong_t)m, umask++, Efault); Because that macro didn't evaluate its second parameter, "umask++" was passed to a call to kmsan_copy_to_user(), which resulted in an extra increment of umask. This probably violated some expectations of the userspace app, which in turn led to repetitive kernel calls. Could you please check if the patch below fixes the problem for you? diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 8bc614cfe21b9..1cc756eafa447 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -254,24 +254,25 @@ extern void __put_user_nocheck_8(void); #define __put_user_size(x, ptr, size, label) \ do { \ __typeof__(*(ptr)) __x = (x); /* eval x once */ \ - __chk_user_ptr(ptr); \ + __typeof__(ptr) __ptr = (ptr); /* eval ptr once */ \ + __chk_user_ptr(__ptr); \ switch (size) { \ case 1: \ - __put_user_goto(__x, ptr, "b", "iq", label); \ + __put_user_goto(__x, __ptr, "b", "iq", label); \ break; \ case 2: \ - __put_user_goto(__x, ptr, "w", "ir", label); \ + __put_user_goto(__x, __ptr, "w", "ir", label); \ break; \ case 4: \ - __put_user_goto(__x, ptr, "l", "ir", label); \ + __put_user_goto(__x, __ptr, "l", "ir", label); \ break; \ case 8: \ - __put_user_goto_u64(__x, ptr, label); \ + __put_user_goto_u64(__x, __ptr, label); \ break; \ default: \ __put_user_bad(); \ } \ - instrument_put_user(__x, ptr, size); \ + instrument_put_user(__x, __ptr, size); \ } while (0) #ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
I test this patch fix my problem. 2022-10-22 4:37 GMT+08:00, Alexander Potapenko <glider@google.com>: > On Fri, Oct 21, 2022 at 8:19 AM youling 257 <youling257@gmail.com> wrote: > >> CONFIG_DEBUG_INFO=y >> CONFIG_AS_HAS_NON_CONST_LEB128=y >> # CONFIG_DEBUG_INFO_NONE is not set >> CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y >> # CONFIG_DEBUG_INFO_DWARF4 is not set >> # CONFIG_DEBUG_INFO_DWARF5 is not set >> # CONFIG_DEBUG_INFO_REDUCED is not set >> # CONFIG_DEBUG_INFO_COMPRESSED is not set >> # CONFIG_DEBUG_INFO_SPLIT is not set >> # CONFIG_DEBUG_INFO_BTF is not set >> # CONFIG_GDB_SCRIPTS is not set >> >> perf top still no function name. >> >> 12.90% [kernel] [k] 0xffffffff833dfa64 >> > > I think I know what's going on. The two functions that differ with and > without the patch were passing an incremented pointer to unsafe_put_user(), > which is a macro, e.g.: > > unsafe_put_user((compat_ulong_t)m, umask++, Efault); > > Because that macro didn't evaluate its second parameter, "umask++" was > passed to a call to kmsan_copy_to_user(), which resulted in an extra > increment of umask. > This probably violated some expectations of the userspace app, which in > turn led to repetitive kernel calls. > > Could you please check if the patch below fixes the problem for you? > > diff --git a/arch/x86/include/asm/uaccess.h > b/arch/x86/include/asm/uaccess.h > index 8bc614cfe21b9..1cc756eafa447 100644 > --- a/arch/x86/include/asm/uaccess.h > +++ b/arch/x86/include/asm/uaccess.h > @@ -254,24 +254,25 @@ extern void __put_user_nocheck_8(void); > #define __put_user_size(x, ptr, size, label) \ > do { \ > __typeof__(*(ptr)) __x = (x); /* eval x once */ \ > - __chk_user_ptr(ptr); \ > + __typeof__(ptr) __ptr = (ptr); /* eval ptr once */ \ > + __chk_user_ptr(__ptr); \ > switch (size) { \ > case 1: \ > - __put_user_goto(__x, ptr, "b", "iq", label); \ > + __put_user_goto(__x, __ptr, "b", "iq", label); \ > break; \ > case 2: \ > - __put_user_goto(__x, ptr, "w", "ir", label); \ > + __put_user_goto(__x, __ptr, "w", "ir", label); \ > break; \ > case 4: \ > - __put_user_goto(__x, ptr, "l", "ir", label); \ > + __put_user_goto(__x, __ptr, "l", "ir", label); \ > break; \ > case 8: \ > - __put_user_goto_u64(__x, ptr, label); \ > + __put_user_goto_u64(__x, __ptr, label); \ > break; \ > default: \ > __put_user_bad(); \ > } \ > - instrument_put_user(__x, ptr, size); \ > + instrument_put_user(__x, __ptr, size); \ > } while (0) > > #ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT >
diff --git a/include/linux/instrumented.h b/include/linux/instrumented.h index 9f1dba8f717b0..501fa84867494 100644 --- a/include/linux/instrumented.h +++ b/include/linux/instrumented.h @@ -2,7 +2,7 @@ /* * This header provides generic wrappers for memory access instrumentation that - * the compiler cannot emit for: KASAN, KCSAN. + * the compiler cannot emit for: KASAN, KCSAN, KMSAN. */ #ifndef _LINUX_INSTRUMENTED_H #define _LINUX_INSTRUMENTED_H @@ -10,6 +10,7 @@ #include <linux/compiler.h> #include <linux/kasan-checks.h> #include <linux/kcsan-checks.h> +#include <linux/kmsan-checks.h> #include <linux/types.h> /** @@ -117,6 +118,7 @@ instrument_copy_to_user(void __user *to, const void *from, unsigned long n) { kasan_check_read(from, n); kcsan_check_read(from, n); + kmsan_copy_to_user(to, from, n, 0); } /** @@ -151,6 +153,7 @@ static __always_inline void instrument_copy_from_user_after(const void *to, const void __user *from, unsigned long n, unsigned long left) { + kmsan_unpoison_memory(to, n - left); } /** @@ -162,10 +165,14 @@ instrument_copy_from_user_after(const void *to, const void __user *from, * * @to destination variable, may not be address-taken */ -#define instrument_get_user(to) \ -({ \ +#define instrument_get_user(to) \ +({ \ + u64 __tmp = (u64)(to); \ + kmsan_unpoison_memory(&__tmp, sizeof(__tmp)); \ + to = __tmp; \ }) + /** * instrument_put_user() - add instrumentation to put_user()-like macros * @@ -177,8 +184,9 @@ instrument_copy_from_user_after(const void *to, const void __user *from, * @ptr userspace pointer to copy to * @size number of bytes to copy */ -#define instrument_put_user(from, ptr, size) \ -({ \ +#define instrument_put_user(from, ptr, size) \ +({ \ + kmsan_copy_to_user(ptr, &from, sizeof(from), 0); \ }) #endif /* _LINUX_INSTRUMENTED_H */ diff --git a/include/linux/kmsan-checks.h b/include/linux/kmsan-checks.h index a6522a0c28df9..c4cae333deec5 100644 --- a/include/linux/kmsan-checks.h +++ b/include/linux/kmsan-checks.h @@ -46,6 +46,21 @@ void kmsan_unpoison_memory(const void *address, size_t size); */ void kmsan_check_memory(const void *address, size_t size); +/** + * kmsan_copy_to_user() - Notify KMSAN about a data transfer to userspace. + * @to: destination address in the userspace. + * @from: source address in the kernel. + * @to_copy: number of bytes to copy. + * @left: number of bytes not copied. + * + * If this is a real userspace data transfer, KMSAN checks the bytes that were + * actually copied to ensure there was no information leak. If @to belongs to + * the kernel space (which is possible for compat syscalls), KMSAN just copies + * the metadata. + */ +void kmsan_copy_to_user(void __user *to, const void *from, size_t to_copy, + size_t left); + #else static inline void kmsan_poison_memory(const void *address, size_t size, @@ -58,6 +73,10 @@ static inline void kmsan_unpoison_memory(const void *address, size_t size) static inline void kmsan_check_memory(const void *address, size_t size) { } +static inline void kmsan_copy_to_user(void __user *to, const void *from, + size_t to_copy, size_t left) +{ +} #endif diff --git a/mm/kmsan/hooks.c b/mm/kmsan/hooks.c index 6f3e64b0b61f8..5c0eb25d984d7 100644 --- a/mm/kmsan/hooks.c +++ b/mm/kmsan/hooks.c @@ -205,6 +205,44 @@ void kmsan_iounmap_page_range(unsigned long start, unsigned long end) kmsan_leave_runtime(); } +void kmsan_copy_to_user(void __user *to, const void *from, size_t to_copy, + size_t left) +{ + unsigned long ua_flags; + + if (!kmsan_enabled || kmsan_in_runtime()) + return; + /* + * At this point we've copied the memory already. It's hard to check it + * before copying, as the size of actually copied buffer is unknown. + */ + + /* copy_to_user() may copy zero bytes. No need to check. */ + if (!to_copy) + return; + /* Or maybe copy_to_user() failed to copy anything. */ + if (to_copy <= left) + return; + + ua_flags = user_access_save(); + if ((u64)to < TASK_SIZE) { + /* This is a user memory access, check it. */ + kmsan_internal_check_memory((void *)from, to_copy - left, to, + REASON_COPY_TO_USER); + } else { + /* Otherwise this is a kernel memory access. This happens when a + * compat syscall passes an argument allocated on the kernel + * stack to a real syscall. + * Don't check anything, just copy the shadow of the copied + * bytes. + */ + kmsan_internal_memmove_metadata((void *)to, (void *)from, + to_copy - left); + } + user_access_restore(ua_flags); +} +EXPORT_SYMBOL(kmsan_copy_to_user); + /* Functions from kmsan-checks.h follow. */ void kmsan_poison_memory(const void *address, size_t size, gfp_t flags) {