Message ID | CAABZP2wCsXG=qaZ68OwDDPA=uG-4Wb-e6WxMNuHNbJTZ1ruenA@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | Possible problem in check-local-export during the kernel build process (RCU torture) |
On Tue, Jun 28, 2022 at 9:55 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> Dear Masahiro:
>
> 1. The cause of the problem
> When I run the RCU torture test:
>
> #git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> #cd linux-rcu
> #git checkout remotes/origin/pmladek.2022.06.15a
> #./tools/testing/selftests/rcutorture/bin/torture.sh
>
> the kernel build reports errors on both a Dell PowerEdge R720
> and a ThinkPad T14 (AMD).
>
> For example:
> /mnt/rcu/linux-rcu/tools/testing/selftests/rcutorture/res/2022.06.27-10.42.37-torture#
> find . -name Make.out|xargs grep Error
> ./results-rcutorture-kasan/RUDE01/Make.out:make[2]: ***
> [scripts/Makefile.build:257: kernel/trace/power-traces.o] Error 255

check-local-export stopped using the process substitution.
See this commit:

commit da4288b95baa1c7c9aa8a476f58b37eb238745b0
Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Wed Jun 8 10:11:00 2022 +0900

    scripts/check-local-export: avoid 'wait $!' for process substitution

You are debugging the stale Kbuild code.

If your main interest is RCU torture,
please narrow it down to a standalone script,
not the entire Kbuild.

>
> 2. I traced the problem to check-local-export
> I added some echo statements to Makefile.build:
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 1f01ac65c0cd..0d48a2d3efff 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -227,13 +227,21 @@ cmd_check_local_export = $(srctree)/scripts/check-local-export $@
>
>  define rule_cc_o_c
>  	$(call cmd_and_fixdep,cc_o_c)
> +	echo "after fixdep $$? $<"
>  	$(call cmd,gen_ksymdeps)
> +	echo "after gen_ksymdeps $$? $<"
>  	$(call cmd,check_local_export)
> +	echo "after check_local_export $$? $<"
>  	$(call cmd,checksrc)
> +	echo "after checksrc $$? $<"
>  	$(call cmd,checkdoc)
> +	echo "after checkdoc $$? $<"
>  	$(call cmd,gen_objtooldep)
> +	echo "after gen_objtooldep $$? $<"
>  	$(call cmd,gen_symversions_c)
> +	echo "after gen_symversions $$? $<"
>  	$(call cmd,record_mcount)
> +	echo "after record_mcount $$? $<"
>  endef
>
> Then I reran torture.sh.
>
> The result shows that in all failed builds, execution did not continue
> past check_local_export.
>
> 3. I traced into the wait statement in check-local-export
> diff --git a/scripts/check-local-export b/scripts/check-local-export
> index da745e2743b7..d35477d95bdc 100755
> --- a/scripts/check-local-export
> +++ b/scripts/check-local-export
> @@ -12,7 +12,7 @@ declare -A symbol_types
>  declare -a export_symbols
>
>  exit_code=0
> -
> +echo "check-local-export L15 ${0} ${1}"
>  while read value type name
>  do
>  	# Skip the line if the number of fields is less than 3.
> @@ -50,9 +50,10 @@ do
>  	# done < <(${NM} --quiet ${1})
>  done < <(${NM} ${1} 2>/dev/null || { echo "${0}: ${NM} failed" >&2; false; } )
>
> +echo "check-local-export L53 ${0} ${1}"
>  # Catch error in the process substitution
>  wait $!
> -
> +echo "check-local-export L56 ${0} ${1} $! $?"
>  for name in "${export_symbols[@]}"
>  do
>  	# nm(3) says "If lowercase, the symbol is usually local"
> @@ -61,5 +62,9 @@ do
>  		exit_code=1
>  	fi
>  done
> -
> +if [ ${exit_code} -ne 0 ] ; then
> +	echo "Zhouyi Zhou"
> +	echo ${exit_code}
> +fi
> +echo "check-local-export L69 $? ${exit_code}"
>  exit ${exit_code}
>
> Then I reran torture.sh.
>
> The result shows that wait $! is where all failed builds stop: in every
> failed case there is an L53 line but no L56 line.
>
> 4. I looked into the source code of the wait command
> #wget http://ftp.gnu.org/gnu/bash/bash-5.0.tar.gz
> #tar zxf bash-5.0.tar.gz
> I found that there are QUIT statements in the implementation of the
> function wait_for.
>
> 5. My guess
> The wait statement in check-local-export may cause bash to quit.
> I am very interested in this problem, but I am a rookie. I would be
> glad to continue the investigation with your further directions.
>
> Sorry to have caused you so much trouble.
>
>
> Kind Regards
> Thank you very much
> Zhouyi
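For reference, the "standalone script" suggestion above could be approached with something like the following. This is only a minimal sketch, not code from the thread or from Kbuild: the printf producer is a stand-in for "${NM} ${1}", and ITERATIONS, failures and lines_read are hypothetical names introduced for illustration. It repeats the "read from a process substitution, then wait $!" pattern and counts how often wait reports a non-zero status.

```bash
#!/usr/bin/env bash
# Hypothetical standalone reproducer for the "read from <(...) then wait $!"
# pattern. The printf producer stands in for "${NM} ${1}"; ITERATIONS is an
# assumed knob, not something taken from Kbuild.

iterations=${ITERATIONS:-200}
failures=0
lines_read=0

for ((i = 0; i < iterations; i++)); do
	# Consume three-field lines, like the nm output loop in the script under test.
	while read -r value type name; do
		lines_read=$((lines_read + 1))
	done < <(printf '%s\n' "0000000000000000 T foo" "0000000000000010 t bar")

	# Collect the status of the process substitution, as the old script did.
	wait $!
	status=$?
	if [ "$status" -ne 0 ]; then
		failures=$((failures + 1))
		echo "iteration ${i}: wait \$! returned ${status}" >&2
	fi
done

echo "iterations=${iterations} failures=${failures} lines_read=${lines_read}"
```

Running such a script under load and with different bash versions might show whether the behaviour can be triggered outside Kbuild. If the shell itself exits inside wait, the final summary line would simply never be printed.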
On Tue, Jun 28, 2022 at 3:25 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Tue, Jun 28, 2022 at 9:55 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > Dear Masahiro:
> >
> > 1. The cause of the problem
> > When I run the RCU torture test:
> >
> > #git clone https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> > #cd linux-rcu
> > #git checkout remotes/origin/pmladek.2022.06.15a
> > #./tools/testing/selftests/rcutorture/bin/torture.sh
> >
> > the kernel build reports errors on both a Dell PowerEdge R720
> > and a ThinkPad T14 (AMD).
> >
> > For example:
> > /mnt/rcu/linux-rcu/tools/testing/selftests/rcutorture/res/2022.06.27-10.42.37-torture#
> > find . -name Make.out|xargs grep Error
> > ./results-rcutorture-kasan/RUDE01/Make.out:make[2]: ***
> > [scripts/Makefile.build:257: kernel/trace/power-traces.o] Error 255
>
>
> check-local-export stopped using the process substitution.
> See this commit:
>
>
> commit da4288b95baa1c7c9aa8a476f58b37eb238745b0
> Author: Masahiro Yamada <masahiroy@kernel.org>
> Date:   Wed Jun 8 10:11:00 2022 +0900
>
>     scripts/check-local-export: avoid 'wait $!' for process substitution
>
>
> You are debugging the stale Kbuild code.

Thank you, Masahiro, for your help.
I applied da4288b95baa1c7c9aa8a476f58b37eb238745b0 to the pmladek.2022.06.15a
branch of linux-rcu. After 13 hours of RCU torture testing, all the test
cases built successfully!
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>

>
> If your main interest is RCU torture,
> please narrow it down to a standalone script,
> not the entire Kbuild.

OK, and thank you and Paul both for your guidance and encouragement ;-)

>
> >
> > 2. I traced the problem to check-local-export
> > I added some echo statements to Makefile.build:
> > diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> > index 1f01ac65c0cd..0d48a2d3efff 100644
> > --- a/scripts/Makefile.build
> > +++ b/scripts/Makefile.build
> > @@ -227,13 +227,21 @@ cmd_check_local_export = $(srctree)/scripts/check-local-export $@
> >
> >  define rule_cc_o_c
> >  	$(call cmd_and_fixdep,cc_o_c)
> > +	echo "after fixdep $$? $<"
> >  	$(call cmd,gen_ksymdeps)
> > +	echo "after gen_ksymdeps $$? $<"
> >  	$(call cmd,check_local_export)
> > +	echo "after check_local_export $$? $<"
> >  	$(call cmd,checksrc)
> > +	echo "after checksrc $$? $<"
> >  	$(call cmd,checkdoc)
> > +	echo "after checkdoc $$? $<"
> >  	$(call cmd,gen_objtooldep)
> > +	echo "after gen_objtooldep $$? $<"
> >  	$(call cmd,gen_symversions_c)
> > +	echo "after gen_symversions $$? $<"
> >  	$(call cmd,record_mcount)
> > +	echo "after record_mcount $$? $<"
> >  endef
> >
> > Then I reran torture.sh.
> >
> > The result shows that in all failed builds, execution did not continue
> > past check_local_export.
> >
> > 3. I traced into the wait statement in check-local-export
> > diff --git a/scripts/check-local-export b/scripts/check-local-export
> > index da745e2743b7..d35477d95bdc 100755
> > --- a/scripts/check-local-export
> > +++ b/scripts/check-local-export
> > @@ -12,7 +12,7 @@ declare -A symbol_types
> >  declare -a export_symbols
> >
> >  exit_code=0
> > -
> > +echo "check-local-export L15 ${0} ${1}"
> >  while read value type name
> >  do
> >  	# Skip the line if the number of fields is less than 3.
> > @@ -50,9 +50,10 @@ do
> >  	# done < <(${NM} --quiet ${1})
> >  done < <(${NM} ${1} 2>/dev/null || { echo "${0}: ${NM} failed" >&2; false; } )
> >
> > +echo "check-local-export L53 ${0} ${1}"
> >  # Catch error in the process substitution
> >  wait $!
> > -
> > +echo "check-local-export L56 ${0} ${1} $! $?"
> >  for name in "${export_symbols[@]}"
> >  do
> >  	# nm(3) says "If lowercase, the symbol is usually local"
> > @@ -61,5 +62,9 @@ do
> >  		exit_code=1
> >  	fi
> >  done
> > -
> > +if [ ${exit_code} -ne 0 ] ; then
> > +	echo "Zhouyi Zhou"
> > +	echo ${exit_code}
> > +fi
> > +echo "check-local-export L69 $? ${exit_code}"
> >  exit ${exit_code}
> >
> > Then I reran torture.sh.
> >
> > The result shows that wait $! is where all failed builds stop: in every
> > failed case there is an L53 line but no L56 line.
> >
> > 4. I looked into the source code of the wait command
> > #wget http://ftp.gnu.org/gnu/bash/bash-5.0.tar.gz
> > #tar zxf bash-5.0.tar.gz
> > I found that there are QUIT statements in the implementation of the
> > function wait_for.
> >
> > 5. My guess
> > The wait statement in check-local-export may cause bash to quit.
> > I am very interested in this problem, but I am a rookie. I would be
> > glad to continue the investigation with your further directions.
> >
> > Sorry to have caused you so much trouble.
> >
> >
> > Kind Regards
> > Thank you very much
> > Zhouyi
>
>
>
> --
> Best Regards
> Masahiro Yamada

Best Regards
Zhouyi
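As a side note on the fix discussed above: one way to read a command's output in the current shell without a process substitution, and so without 'wait $!', is a plain pipeline combined with bash's lastpipe and pipefail options. The sketch below only illustrates that idea and is not claimed to be the content of commit da4288b95baa; producer, count_symbols and symbol_count are hypothetical names, and producer stands in for "${NM} ${1}".

```bash
#!/usr/bin/env bash
# Sketch: consume a command's output in the current shell and still see the
# command's exit status, without process substitution or 'wait $!'.
# 'producer' is a hypothetical stand-in for "${NM} ${1}".

set -o pipefail      # pipeline status reflects a failing non-final command
shopt -s lastpipe    # run the last pipeline element in the current shell
                     # (bash >= 4.2, only effective when job control is off)

producer() {
	printf '%s\n' "0000000000000000 T foo" "0000000000000010 t bar"
	return "$1"      # simulate success (0) or failure (non-zero)
}

count_symbols() {
	local value type name
	symbol_count=0
	# With lastpipe, this loop runs in the current shell, so symbol_count
	# survives after the pipeline finishes.
	producer "$1" | while read -r value type name; do
		symbol_count=$((symbol_count + 1))
	done
}

count_symbols 0; ok_status=$?
ok_count=$symbol_count

count_symbols 3; fail_status=$?

echo "ok_status=${ok_status} ok_count=${ok_count} fail_status=${fail_status}"
```

The design point is that the loop's variables stay visible after the pipeline because of lastpipe, while pipefail makes a failure in the producer show up as the pipeline's own status, so no separate wait is needed.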