Message ID | 20220714143907.25938-2-anthony.perard@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen: check-endbr.sh fix and improvement | expand |
Hi Anthony, > On 14 Jul 2022, at 15:39, Anthony PERARD <anthony.perard@citrix.com> wrote: > > check-endbr.sh works well with gawk, but fails with mawk. The produced > $ALL file is smaller, it is missing 0x$vma_lo on every line. On mawk, > int(0x2A) just produce 0, instead of the expected value. > > The use of hexadecimal-constant in awk is an optional part of the > posix spec, and mawk doesn't seems to implemented. > > There is a way to convert an hexadecimal to a number be putting it in > a string, and awk as I understand is supposed to use strtod() to > convert the string to a number when needed. The expression > 'int("0x15") + 21' would produce the expected value in `mawk` but now > `gawk` won't convert the string to a number unless we use the option > "--non-decimal-data". > > So let's convert the hexadecimal number before using it in the awk > script. The shell as no issue with dealing with hexadecimal-constant > so we'll simply use the expression "$(( 0x15 ))" to convert the value > before using it in awk. > > Fixes: 4d037425dc ("x86: Build check for embedded endbr64 instructions") > Reported-by: Luca Fancellu <Luca.Fancellu@arm.com> > Reported-by: Mathieu Tarral <mathieu.tarral@protonmail.com> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Very nice solution Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> Cheers Bertrand
On 14/07/2022 15:39, Anthony PERARD wrote: > check-endbr.sh works well with gawk, but fails with mawk. The produced > $ALL file is smaller, it is missing 0x$vma_lo on every line. On mawk, > int(0x2A) just produce 0, instead of the expected value. > > The use of hexadecimal-constant in awk is an optional part of the > posix spec, and mawk doesn't seems to implemented. > > There is a way to convert an hexadecimal to a number be putting it in > a string, and awk as I understand is supposed to use strtod() to > convert the string to a number when needed. The expression > 'int("0x15") + 21' would produce the expected value in `mawk` but now > `gawk` won't convert the string to a number unless we use the option > "--non-decimal-data". > > So let's convert the hexadecimal number before using it in the awk > script. The shell as no issue with dealing with hexadecimal-constant > so we'll simply use the expression "$(( 0x15 ))" to convert the value > before using it in awk. > > Fixes: 4d037425dc ("x86: Build check for embedded endbr64 instructions") > Reported-by: Luca Fancellu <Luca.Fancellu@arm.com> > Reported-by: Mathieu Tarral <mathieu.tarral@protonmail.com> > Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Thanks for doing this. You beat me to it. On policy first, we have https://gitlab.com/xen-project/xen/-/issues/26 open for tracking this bug. We should consider having Resolves xen-project/xen#26 in our list of tags, so Gitlab can properly cross-reference this fix. (I wonder if Resolves: works...) https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically is the full list of patterns available, but I think we want to keep Fixes: for it's current meaning. I also want to wait for the patchew CI run to complete because we've got several build environments which have been a fertile source of shell related bugs. ~Andrew
On 14/07/2022 15:39, Anthony PERARD wrote: > diff --git a/xen/tools/check-endbr.sh b/xen/tools/check-endbr.sh > index 552f233912..64fa9a56b7 100755 > --- a/xen/tools/check-endbr.sh > +++ b/xen/tools/check-endbr.sh > @@ -78,7 +78,7 @@ then > else > grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \ > -e "$(printf '\146\17\37\1')" $TEXT_BIN > -fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int(0x'$vma_lo') + $1}' > $ALL > +fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL > > # Wait for $VALID to become complete > wait I thought I'd found a cunning way to simply this, but alas. Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, but this warrants a comment, so I've added this hunk too: diff --git a/xen/tools/check-endbr.sh b/xen/tools/check-endbr.sh index f633846b0f79..80955f74c71c 100755 --- a/xen/tools/check-endbr.sh +++ b/xen/tools/check-endbr.sh @@ -64,6 +64,11 @@ ${OBJDUMP} -j .text $1 -d -w | grep ' endbr64 *$' | cut -f 1 -d ':' > $VALID & # split the VMA in half so AWK's numeric addition is only working on 32 bit # numbers, which don't lose precision. # +# 4) MAWK doesn't support plain hex constants (an optional part of the POSIX +# spec), and GAWK and MAWK can't agree on how to work with hex constants +# in a string. Use the shell to convert $vma_lo to decimal before passing +# to AWK. +# eval $(${OBJDUMP} -j .text $1 -h | $AWK '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}')
On 14/07/2022 16:12, Andrew Cooper wrote: > On 14/07/2022 15:39, Anthony PERARD wrote: >> check-endbr.sh works well with gawk, but fails with mawk. The produced >> $ALL file is smaller, it is missing 0x$vma_lo on every line. On mawk, >> int(0x2A) just produce 0, instead of the expected value. >> >> The use of hexadecimal-constant in awk is an optional part of the >> posix spec, and mawk doesn't seems to implemented. >> >> There is a way to convert an hexadecimal to a number be putting it in >> a string, and awk as I understand is supposed to use strtod() to >> convert the string to a number when needed. The expression >> 'int("0x15") + 21' would produce the expected value in `mawk` but now >> `gawk` won't convert the string to a number unless we use the option >> "--non-decimal-data". >> >> So let's convert the hexadecimal number before using it in the awk >> script. The shell as no issue with dealing with hexadecimal-constant >> so we'll simply use the expression "$(( 0x15 ))" to convert the value >> before using it in awk. >> >> Fixes: 4d037425dc ("x86: Build check for embedded endbr64 instructions") >> Reported-by: Luca Fancellu <Luca.Fancellu@arm.com> >> Reported-by: Mathieu Tarral <mathieu.tarral@protonmail.com> >> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> > Thanks for doing this. You beat me to it. > > On policy first, we have https://gitlab.com/xen-project/xen/-/issues/26 > open for tracking this bug. > > We should consider having > > Resolves xen-project/xen#26 > > in our list of tags, so Gitlab can properly cross-reference this fix. > (I wonder if Resolves: works...) Yes it does. Gitlab successfully cross-referenced my dev branch ... > > https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically > is the full list of patterns available, but I think we want to keep > Fixes: for it's current meaning. > > > I also want to wait for the patchew CI run to complete ... pushed because patchew failed to pick the series up for some reason. ~Andrew
On 14.07.2022 16:39, Anthony PERARD wrote: > --- a/xen/tools/check-endbr.sh > +++ b/xen/tools/check-endbr.sh > @@ -78,7 +78,7 @@ then > else > grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \ > -e "$(printf '\146\17\37\1')" $TEXT_BIN > -fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int(0x'$vma_lo') + $1}' > $ALL > +fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL I'm afraid that's not portable to environments where sizeof(long) < 8. The shell isn't required to use wider than long for the evaluation. Jan
On 14/07/2022 19:59, Andrew Cooper wrote: > On 14/07/2022 16:12, Andrew Cooper wrote: >> On 14/07/2022 15:39, Anthony PERARD wrote: >>> check-endbr.sh works well with gawk, but fails with mawk. The produced >>> $ALL file is smaller, it is missing 0x$vma_lo on every line. On mawk, >>> int(0x2A) just produce 0, instead of the expected value. >>> >>> The use of hexadecimal-constant in awk is an optional part of the >>> posix spec, and mawk doesn't seems to implemented. >>> >>> There is a way to convert an hexadecimal to a number be putting it in >>> a string, and awk as I understand is supposed to use strtod() to >>> convert the string to a number when needed. The expression >>> 'int("0x15") + 21' would produce the expected value in `mawk` but now >>> `gawk` won't convert the string to a number unless we use the option >>> "--non-decimal-data". >>> >>> So let's convert the hexadecimal number before using it in the awk >>> script. The shell as no issue with dealing with hexadecimal-constant >>> so we'll simply use the expression "$(( 0x15 ))" to convert the value >>> before using it in awk. >>> >>> Fixes: 4d037425dc ("x86: Build check for embedded endbr64 instructions") >>> Reported-by: Luca Fancellu <Luca.Fancellu@arm.com> >>> Reported-by: Mathieu Tarral <mathieu.tarral@protonmail.com> >>> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> >> Thanks for doing this. You beat me to it. >> >> On policy first, we have https://gitlab.com/xen-project/xen/-/issues/26 >> open for tracking this bug. >> >> We should consider having >> >> Resolves xen-project/xen#26 >> >> in our list of tags, so Gitlab can properly cross-reference this fix. >> (I wonder if Resolves: works...) > Yes it does. Gitlab successfully cross-referenced my dev branch ... > >> https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#closing-issues-automatically >> is the full list of patterns available, but I think we want to keep >> Fixes: for it's current meaning. >> >> >> I also want to wait for the patchew CI run to complete > ... pushed because patchew failed to pick the series up for some reason. This series is now fully acked/reviewed and ready, and passed CI (well - the bits of CI which aren't broken for other reasons). Given the lack of objections, I'm going to use this patch alone as an experiment to see how Resolves: works through other bits of our workflow too. Unless someone objects very promptly. ~Andrew
On 15/07/2022 11:14, Jan Beulich wrote: > On 14.07.2022 16:39, Anthony PERARD wrote: >> --- a/xen/tools/check-endbr.sh >> +++ b/xen/tools/check-endbr.sh >> @@ -78,7 +78,7 @@ then >> else >> grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \ >> -e "$(printf '\146\17\37\1')" $TEXT_BIN >> -fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int(0x'$vma_lo') + $1}' > $ALL >> +fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL > I'm afraid that's not portable to environments where sizeof(long) < 8. > The shell isn't required to use wider than long for the evaluation. Yuck. This works at the moment in 32 bit builds because: ++ objdump -j .text xen-syms -h ++ awk '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, 8), substr($4, 9, 16)}' + eval vma_hi=ffff82d0 vma_lo=40200000 ++ vma_hi=ffff82d0 ++ vma_lo=40200000 so the top bit isn't set. And going from an 8/8 split to a 9/7 split doesn't work either because that uses 4 bits and we've only got 2 to play with given .text's 1G alignment. I know it's disgusting, but how about a BUILD_BUG_ON(XEN_VIRT_START & (1u << 31)) ? ~Andrew
On 15.07.2022 12:43, Andrew Cooper wrote: > On 15/07/2022 11:14, Jan Beulich wrote: >> On 14.07.2022 16:39, Anthony PERARD wrote: >>> --- a/xen/tools/check-endbr.sh >>> +++ b/xen/tools/check-endbr.sh >>> @@ -78,7 +78,7 @@ then >>> else >>> grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \ >>> -e "$(printf '\146\17\37\1')" $TEXT_BIN >>> -fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int(0x'$vma_lo') + $1}' > $ALL >>> +fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL >> I'm afraid that's not portable to environments where sizeof(long) < 8. >> The shell isn't required to use wider than long for the evaluation. > > Yuck. This works at the moment in 32 bit builds because: > > ++ objdump -j .text xen-syms -h > ++ awk '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, > 8), substr($4, 9, 16)}' > + eval vma_hi=ffff82d0 vma_lo=40200000 > ++ vma_hi=ffff82d0 > ++ vma_lo=40200000 > > so the top bit isn't set. > > And going from an 8/8 split to a 9/7 split doesn't work either because > that uses 4 bits and we've only got 2 to play with given .text's 1G > alignment. Why does text alignment matter here? All we care about are offsets into the Xen image. An I guess we aren't really at risk of going beyond 256M in image size ... > I know it's disgusting, but how about a BUILD_BUG_ON(XEN_VIRT_START & > (1u << 31)) ? In the worst case, why not. But that would be an odd restriction on changes to the memory layout (which we've done in the past). Jan
On 15/07/2022 11:51, Jan Beulich wrote: > On 15.07.2022 12:43, Andrew Cooper wrote: >> On 15/07/2022 11:14, Jan Beulich wrote: >>> On 14.07.2022 16:39, Anthony PERARD wrote: >>>> --- a/xen/tools/check-endbr.sh >>>> +++ b/xen/tools/check-endbr.sh >>>> @@ -78,7 +78,7 @@ then >>>> else >>>> grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \ >>>> -e "$(printf '\146\17\37\1')" $TEXT_BIN >>>> -fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int(0x'$vma_lo') + $1}' > $ALL >>>> +fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL >>> I'm afraid that's not portable to environments where sizeof(long) < 8. >>> The shell isn't required to use wider than long for the evaluation. >> Yuck. This works at the moment in 32 bit builds because: >> >> ++ objdump -j .text xen-syms -h >> ++ awk '$2 == ".text" {printf "vma_hi=%s\nvma_lo=%s\n", substr($4, 1, >> 8), substr($4, 9, 16)}' >> + eval vma_hi=ffff82d0 vma_lo=40200000 >> ++ vma_hi=ffff82d0 >> ++ vma_lo=40200000 >> >> so the top bit isn't set. >> >> And going from an 8/8 split to a 9/7 split doesn't work either because >> that uses 4 bits and we've only got 2 to play with given .text's 1G >> alignment. > Why does text alignment matter here? All we care about are offsets > into the Xen image. An I guess we aren't really at risk of going > beyond 256M in image size ... Very good point, but it's not even the image size. It's only .text. Furthermore, we can have the safety check for that in this script too. Lemme see what I can do. >> I know it's disgusting, but how about a BUILD_BUG_ON(XEN_VIRT_START & >> (1u << 31)) ? > In the worst case, why not. But that would be an odd restriction on > changes to the memory layout (which we've done in the past). I did say disgusting... I'm entirely happy that it doesn't appear to be needed. ~Andrew
diff --git a/xen/tools/check-endbr.sh b/xen/tools/check-endbr.sh index 552f233912..64fa9a56b7 100755 --- a/xen/tools/check-endbr.sh +++ b/xen/tools/check-endbr.sh @@ -78,7 +78,7 @@ then else grep -aob -e "$(printf '\363\17\36\372')" -e "$(printf '\363\17\36\373')" \ -e "$(printf '\146\17\37\1')" $TEXT_BIN -fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int(0x'$vma_lo') + $1}' > $ALL +fi | awk -F':' '{printf "%s%x\n", "'$vma_hi'", int('$((0x$vma_lo))') + $1}' > $ALL # Wait for $VALID to become complete wait
check-endbr.sh works well with gawk, but fails with mawk. The produced $ALL file is smaller, it is missing 0x$vma_lo on every line. On mawk, int(0x2A) just produce 0, instead of the expected value. The use of hexadecimal-constant in awk is an optional part of the posix spec, and mawk doesn't seems to implemented. There is a way to convert an hexadecimal to a number be putting it in a string, and awk as I understand is supposed to use strtod() to convert the string to a number when needed. The expression 'int("0x15") + 21' would produce the expected value in `mawk` but now `gawk` won't convert the string to a number unless we use the option "--non-decimal-data". So let's convert the hexadecimal number before using it in the awk script. The shell as no issue with dealing with hexadecimal-constant so we'll simply use the expression "$(( 0x15 ))" to convert the value before using it in awk. Fixes: 4d037425dc ("x86: Build check for embedded endbr64 instructions") Reported-by: Luca Fancellu <Luca.Fancellu@arm.com> Reported-by: Mathieu Tarral <mathieu.tarral@protonmail.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> --- xen/tools/check-endbr.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)