Message ID | 20210320042753.69297-1-gshan@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | configure: Improve alias attribute check | expand |
On 20/03/2021 05.27, Gavin Shan wrote: > It's still possible that the wrong value is returned from the alias > of variable even if the program can be compiled without issue. This > improves the check by executing the binary to check the result. > > If alias attribute can't be working properly, the @target_page in > exec-vary.c will always return zeroes when we have the following gcc > version. > > # gcc --version > gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0) > > This abstracts the code from exec-vary.c and use it as indicator to > enable gcc alias attribute or not. > > Signed-off-by: Gavin Shan <gshan@redhat.com> > --- > configure | 34 ++++++++++++++++++++++++++++++---- > 1 file changed, 30 insertions(+), 4 deletions(-) > > diff --git a/configure b/configure > index f7d022a5db..8321f380d5 100755 > --- a/configure > +++ b/configure > @@ -75,6 +75,7 @@ fi > > TMPB="qemu-conf" > TMPC="${TMPDIR1}/${TMPB}.c" > +TMPC_B="${TMPDIR1}/${TMPB}_b.c" > TMPO="${TMPDIR1}/${TMPB}.o" > TMPCXX="${TMPDIR1}/${TMPB}.cxx" > TMPE="${TMPDIR1}/${TMPB}.exe" > @@ -4878,13 +4879,38 @@ fi > > attralias=no > cat > $TMPC << EOF > -int x = 1; > +static int x; > extern const int y __attribute__((alias("x"))); > -int main(void) { return 0; } > +extern int read_y(void); > +void write_x(int val); > + > +void write_x(int val) > +{ > + x = val; > +} > + > +int main(void) > +{ > + return read_y(); > +} > EOF > -if compile_prog "" "" ; then > - attralias=yes > +cat > $TMPC_B << EOF > +extern const int y; > +extern void write_x(int val); > +int read_y(void); > + > +int read_y(void) > +{ > + write_x(1); > + return y; > +} > +EOF > + > +TMPC+=" ${TMPC_B}" > +if compile_prog "" "" && ! $TMPE; then What about cross-compiling? Running an executable won't work if QEMU gets cross-compiled... Thomas
On 20/03/21 05:27, Gavin Shan wrote: > It's still possible that the wrong value is returned from the alias > of variable even if the program can be compiled without issue. This > improves the check by executing the binary to check the result. > > If alias attribute can't be working properly, the @target_page in > exec-vary.c will always return zeroes when we have the following gcc > version. > > # gcc --version > gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0) > > This abstracts the code from exec-vary.c and use it as indicator to > enable gcc alias attribute or not. > > +void write_x(int val); > + > +void write_x(int val) > +{ > + x = val; > +} > + > +int main(void) > +{ > + return read_y(); > +} I think this should be "read_y() == 1 ? 0 : 1". I can reproduce it with -flto -O2 but not without -flto, do you agree? Perhaps we can obtain the same optimization by wrapping reads of the page size in an inline __attribute__((const)) function. Richard, what do you think? Paolo
On 3/20/21 11:52 AM, Paolo Bonzini wrote: >> +int main(void) >> +{ >> + return read_y(); >> +} > > I think this should be "read_y() == 1 ? 0 : 1". As a testcase returning 0 on success, yes. > I can reproduce it with -flto -O2 but not without -flto, do you agree? Agreed. Replicated with a random recent gcc 11 snapshot. This is really annoying of lto. It's clear something needs to change though. > Perhaps we can obtain the same optimization by wrapping reads of the page size > in an inline __attribute__((const)) function. Richard, what do you think? I'll give it a shot and see what happens. r~
Hi Thomas, On 3/20/21 3:48 PM, Thomas Huth wrote: > On 20/03/2021 05.27, Gavin Shan wrote: >> It's still possible that the wrong value is returned from the alias >> of variable even if the program can be compiled without issue. This >> improves the check by executing the binary to check the result. >> >> If alias attribute can't be working properly, the @target_page in >> exec-vary.c will always return zeroes when we have the following gcc >> version. >> >> # gcc --version >> gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0) >> >> This abstracts the code from exec-vary.c and use it as indicator to >> enable gcc alias attribute or not. >> >> Signed-off-by: Gavin Shan <gshan@redhat.com> >> --- >> configure | 34 ++++++++++++++++++++++++++++++---- >> 1 file changed, 30 insertions(+), 4 deletions(-) >> >> diff --git a/configure b/configure >> index f7d022a5db..8321f380d5 100755 >> --- a/configure >> +++ b/configure >> @@ -75,6 +75,7 @@ fi >> TMPB="qemu-conf" >> TMPC="${TMPDIR1}/${TMPB}.c" >> +TMPC_B="${TMPDIR1}/${TMPB}_b.c" >> TMPO="${TMPDIR1}/${TMPB}.o" >> TMPCXX="${TMPDIR1}/${TMPB}.cxx" >> TMPE="${TMPDIR1}/${TMPB}.exe" >> @@ -4878,13 +4879,38 @@ fi >> attralias=no >> cat > $TMPC << EOF >> -int x = 1; >> +static int x; >> extern const int y __attribute__((alias("x"))); >> -int main(void) { return 0; } >> +extern int read_y(void); >> +void write_x(int val); >> + >> +void write_x(int val) >> +{ >> + x = val; >> +} >> + >> +int main(void) >> +{ >> + return read_y(); >> +} >> EOF >> -if compile_prog "" "" ; then >> - attralias=yes >> +cat > $TMPC_B << EOF >> +extern const int y; >> +extern void write_x(int val); >> +int read_y(void); >> + >> +int read_y(void) >> +{ >> + write_x(1); >> + return y; >> +} >> +EOF >> + >> +TMPC+=" ${TMPC_B}" >> +if compile_prog "" "" && ! $TMPE; then > > What about cross-compiling? Running an executable won't work if QEMU gets cross-compiled... > Executing the cross-compiled binary returns 126, which means we will disable gcc alias attribute for cross-compiling case with the following changes included into v2: int main(void) { return (read_y() == 1) ? 0 : 1; } if compile_prog "" "" && $TMPE >/dev/null 2>/dev/null; then attralias=yes fi Thanks, Gavin
Hi Paolo and Richard, On 3/21/21 9:33 AM, Richard Henderson wrote: > On 3/20/21 11:52 AM, Paolo Bonzini wrote: >>> +int main(void) >>> +{ >>> + return read_y(); >>> +} >> >> I think this should be "read_y() == 1 ? 0 : 1". > > As a testcase returning 0 on success, yes. > Ok. I will include the changes in v2. Also, I will wrap the lines, for example: int main(void) { return (read_y() == 1) ? 0 : 1; } if compile_prog "" "" && $TMPE >/dev/null 2>/dev/null; then attralias=yes fi >> I can reproduce it with -flto -O2 but not without -flto, do you agree? > > Agreed. Replicated with a random recent gcc 11 snapshot. > This is really annoying of lto. It's clear something needs to change though. > The command I used is: gcc -O2 -flto=auto config-temp.c config-temp-b.c -o config-temp.exe. Removing "-O2" or "-flto=auto" can make the gcc alias attribute workable again. >> Perhaps we can obtain the same optimization by wrapping reads of the page size in an inline __attribute__((const)) function. Richard, what do you think? > > I'll give it a shot and see what happens. > Thanks, Gavin
On 3/20/21 4:33 PM, Richard Henderson wrote: > On 3/20/21 11:52 AM, Paolo Bonzini wrote: >>> +int main(void) >>> +{ >>> + return read_y(); >>> +} >> >> I think this should be "read_y() == 1 ? 0 : 1". > > As a testcase returning 0 on success, yes. > >> I can reproduce it with -flto -O2 but not without -flto, do you agree? > > Agreed. Replicated with a random recent gcc 11 snapshot. > This is really annoying of lto. It's clear something needs to change though. > >> Perhaps we can obtain the same optimization by wrapping reads of the page >> size in an inline __attribute__((const)) function. Richard, what do you think? > > I'll give it a shot and see what happens. What exact version of gcc are you guys using? Something from rawhide that I can just install? So far I have failed to compile with gcc master with --enable-lto. Lots of missing symbols reported at link time. Therefore I've been unable to actually test what I intended to test. That said, I'm not hopeful that __attribute__((const)) will work. I have an idea that early inlining will inline tiny function calls too soon, before the attribute has a change to DTRT during CSE. At which point we're left with bare variable references and we're exactly where we would be if we did nothing special at all. Another workaround may be to avoid compiling exec-vary.c with -flto. I'm not sure that my meson fu is up to that. Paolo? I have filed a gcc bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696 Hopefully someone can address that before gcc 11 gets released. At which point we need do nothing in qemu. Aldy? r~
Il dom 21 mar 2021, 16:49 Richard Henderson <richard.henderson@linaro.org> ha scritto: > What exact version of gcc are you guys using? Something from rawhide that > I can just install? > I am using Fedora 34. I upgraded just to test this bug and it seems stable except that GNOME Shell extensions need an upgrade. However I haven't tried building all of QEMU, only the test case. So far I have failed to compile with gcc master with --enable-lto. Lots of > missing symbols reported at link time. Therefore I've been unable to > actually > test what I intended to test. > > That said, I'm not hopeful that __attribute__((const)) will work. I have > an > idea that early inlining will inline tiny function calls too soon, before > the > attribute has a change to DTRT during CSE. Yeah that's at least plausible. Another workaround may be to avoid compiling exec-vary.c with -flto. I'm > not > sure that my meson fu is up to that. Paolo? > You would have to define a static library. I have filed a gcc bug report: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696 > > Hopefully someone can address that before gcc 11 gets released. At which > point we need do nothing in qemu. Aldy? Good point, I can give it a shot too just to see how rusty I am... That would be the best outcome, though we would have to check LLVM as well. If const doesn't work it would indeed be prudent to include Gavin's configure check. Paolo > > r~ > >
On 3/21/21 10:50 AM, Paolo Bonzini wrote: > Another workaround may be to avoid compiling exec-vary.c with -flto. I'm not > sure that my meson fu is up to that. Paolo? > > You would have to define a static library. Ok. With an extra -fno-lto flag, or can I somehow remove -flto from the library's cflags? Or unset the meson b_lto variable? > I have filed a gcc bug report: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696 > <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696> > > Hopefully someone can address that before gcc 11 gets released. At which > point we need do nothing in qemu. Aldy? > > > Good point, I can give it a shot too just to see how rusty I am... That would > be the best outcome, though we would have to check LLVM as well. If const > doesn't work it would indeed be prudent to include Gavin's configure check. So, I've reproduced the testcase failure with gcc 9.3 (ubuntu 20.04) as well. Which means that there are at least two releases for which this has not worked. I think Gavin's runtime test is unnecessary. We don't have to check the runtime results, we can just [ "$lto" = true ], and we fairly well know it'll fail. r~
Il dom 21 mar 2021, 18:34 Richard Henderson <richard.henderson@linaro.org> ha scritto: > On 3/21/21 10:50 AM, Paolo Bonzini wrote: > > Another workaround may be to avoid compiling exec-vary.c with > -flto. I'm not > > sure that my meson fu is up to that. Paolo? > > > > You would have to define a static library. > > Ok. With an extra -fno-lto flag, or can I somehow remove -flto from the > library's cflags? Or unset the meson b_lto variable? > -fno-lto should work, yes. b_lto tries to cater to other compilers, but we don't support anything but gcc-like drivers. > I have filed a gcc bug report: > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696 > > > > Hopefully someone can address that before gcc 11 gets released. At > which > > point we need do nothing in qemu. Aldy? > > So, I've reproduced the testcase failure with gcc 9.3 (ubuntu 20.04) as > well. > Which means that there are at least two releases for which this has not > worked. > > I think Gavin's runtime test is unnecessary. We don't have to check the > runtime results, we can just [ "$lto" = true ], and we fairly well know > it'll fail. > Yeah, if anything the test can be used to re-enable attribute((alias)) once we know there are some fixed compilers. (Though it's quite ugly to have worse compilation when cross-compiling). Paolo > > r~ > >
HRM, what about biting the bullet and making exec-vary.c a C++ source?... Then instead of making it conditional an attribute((alias)), we make it conditional on having a C++ compiler. Making cpu-all.h compile as C++ would be complex, but we can isolate all the required declarations in a separate header file that does not need either osdep.h or glib-compat.h, and basically just have a global constructor in exec-vary.cc that forwards to a function in exec.c. Paolo Il dom 21 mar 2021, 18:43 Paolo Bonzini <pbonzini@redhat.com> ha scritto: > > > Il dom 21 mar 2021, 18:34 Richard Henderson <richard.henderson@linaro.org> > ha scritto: > >> On 3/21/21 10:50 AM, Paolo Bonzini wrote: >> > Another workaround may be to avoid compiling exec-vary.c with >> -flto. I'm not >> > sure that my meson fu is up to that. Paolo? >> > >> > You would have to define a static library. >> >> Ok. With an extra -fno-lto flag, or can I somehow remove -flto from the >> library's cflags? Or unset the meson b_lto variable? >> > > -fno-lto should work, yes. b_lto tries to cater to other compilers, but we > don't support anything but gcc-like drivers. > > > I have filed a gcc bug report: >> > >> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99696 >> > >> > Hopefully someone can address that before gcc 11 gets released. At >> which >> > point we need do nothing in qemu. Aldy? >> >> So, I've reproduced the testcase failure with gcc 9.3 (ubuntu 20.04) as >> well. >> Which means that there are at least two releases for which this has not >> worked. >> >> I think Gavin's runtime test is unnecessary. We don't have to check the >> runtime results, we can just [ "$lto" = true ], and we fairly well know >> it'll fail. >> > > Yeah, if anything the test can be used to re-enable attribute((alias)) > once we know there are some fixed compilers. (Though it's quite ugly to > have worse compilation when cross-compiling). > > Paolo > > >> >> r~ >> >>
On 3/21/21 11:46 AM, Paolo Bonzini wrote: > HRM, what about biting the bullet and making exec-vary.c a C++ source?... Then > instead of making it conditional an attribute((alias)), we make it conditional > on having a C++ compiler. Doesn't help. The gcc bug I filed talks about c++, because that's the closest analogy. But set_preferred_target_page_bits is called *much* later than a constructor. Though still before any use of the variable in question, for which we have an --enable-debug-tcg assertion. r~
Hi Richard and Paolo, On 3/22/21 5:23 AM, Richard Henderson wrote: > On 3/21/21 11:46 AM, Paolo Bonzini wrote: >> HRM, what about biting the bullet and making exec-vary.c a C++ source?... Then instead of making it conditional an attribute((alias)), we make it conditional on having a C++ compiler. > > Doesn't help. The gcc bug I filed talks about c++, because that's the closest analogy. > > But set_preferred_target_page_bits is called *much* later than a constructor. Though still before any use of the variable in question, for which we have an --enable-debug-tcg assertion. > It looks this issue can be avoided after "volatile" is applied to @target_page. However, I'm not sure if it's the correct fix to have. If it is, I can post a formal patch so that it can be included. --- a/exec-vary.c +++ b/exec-vary.c -extern const TargetPageBits target_page +extern const TargetPageBits volatile target_page __attribute__((alias("init_target_page"))); --- a/include/exec/cpu-all.h +++ b/include/exec/cpu-all.h -extern const TargetPageBits target_page; +extern const TargetPageBits volatile target_page; According to the experiments I did, it would be function call to set_preferred_target_page_bits() is dropped when the machine is created. The following c files are used in the experiment: --- a.c --- static int x; const extern int VOLATILE y __attribute__((alias("x"))); extern int read_y(void); void write_x(int val) { x = 1; } int main(void) { return read_y(); } --- b.c--- extern const int VOLATILE y; extern void write_x(int val); int read_y(void) { write_x(1); return y; } # gcc a.c b.c -O2 -flto=auto -DVOLATILE= -o a; ./a; echo $? 0 # gdb -nw ./a (gdb) disassem main Dump of assembler code for function main: 0x0000000000400480 <+0>: adrp x1, 0x420000 <__libc_start_main@got.plt> 0x0000000000400484 <+4>: mov w2, #0x1 // #1 0x0000000000400488 <+8>: mov w0, #0x0 // #0 0x000000000040048c <+12>: str w2, [x1, #32] 0x0000000000400490 <+16>: ret End of assembler dump. # gcc a.c b.c -O2 -flto=auto -DVOLATILE=volatile -o a; ./a; echo $? 1 # gdb -nw ./a (gdb) disassem main Dump of assembler code for function main: 0x0000000000400480 <+0>: adrp x1, 0x420000 <__libc_start_main@got.plt> 0x0000000000400484 <+4>: mov w2, #0x1 // #1 0x0000000000400488 <+8>: adrp x0, 0x420000 <__libc_start_main@got.plt> 0x000000000040048c <+12>: str w2, [x1, #32] 0x0000000000400490 <+16>: ldr w0, [x0, #32] 0x0000000000400494 <+20>: ret Thanks, Gavin
On 3/22/21 4:54 AM, Gavin Shan wrote: > It looks this issue can be avoided after "volatile" is applied to > @target_page. However, I'm not sure if it's the correct fix to have. Certainly not. That is the exact opposite of what we want. We want to minimize the number of reads from the variable, not maximize them. r~
Hi Richard, On 3/23/21 7:59 AM, Richard Henderson wrote: > On 3/22/21 4:54 AM, Gavin Shan wrote: >> It looks this issue can be avoided after "volatile" is applied to >> @target_page. However, I'm not sure if it's the correct fix to have. > > Certainly not. > > That is the exact opposite of what we want. We want to minimize the number of reads from the variable, not maximize them. > Yes, It's something I was thinking of. "volatile" can make @target_page visible to gcc, but maximizes the number of reads. By the way, your patch to use "-fno-lto" worked for me and it has been split into 3 patches by Phil. Richard, thanks for the quick fixup :) Thanks, Gavin
diff --git a/configure b/configure index f7d022a5db..8321f380d5 100755 --- a/configure +++ b/configure @@ -75,6 +75,7 @@ fi TMPB="qemu-conf" TMPC="${TMPDIR1}/${TMPB}.c" +TMPC_B="${TMPDIR1}/${TMPB}_b.c" TMPO="${TMPDIR1}/${TMPB}.o" TMPCXX="${TMPDIR1}/${TMPB}.cxx" TMPE="${TMPDIR1}/${TMPB}.exe" @@ -4878,13 +4879,38 @@ fi attralias=no cat > $TMPC << EOF -int x = 1; +static int x; extern const int y __attribute__((alias("x"))); -int main(void) { return 0; } +extern int read_y(void); +void write_x(int val); + +void write_x(int val) +{ + x = val; +} + +int main(void) +{ + return read_y(); +} EOF -if compile_prog "" "" ; then - attralias=yes +cat > $TMPC_B << EOF +extern const int y; +extern void write_x(int val); +int read_y(void); + +int read_y(void) +{ + write_x(1); + return y; +} +EOF + +TMPC+=" ${TMPC_B}" +if compile_prog "" "" && ! $TMPE; then + attralias=yes fi +TMPC="${TMPDIR1}/${TMPB}.c" ######################################## # check if getauxval is available.
It's still possible that the wrong value is returned from the alias of variable even if the program can be compiled without issue. This improves the check by executing the binary to check the result. If alias attribute can't be working properly, the @target_page in exec-vary.c will always return zeroes when we have the following gcc version. # gcc --version gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0) This abstracts the code from exec-vary.c and use it as indicator to enable gcc alias attribute or not. Signed-off-by: Gavin Shan <gshan@redhat.com> --- configure | 34 ++++++++++++++++++++++++++++++---- 1 file changed, 30 insertions(+), 4 deletions(-)