Message ID | efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] gcc-plugins: Handle GCC version mismatch for OOT modules | expand |
On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > When building out-of-tree kernel modules, the build system doesn't > require the GCC version to match the version used to build the original > kernel. That's probably [1] fine. > > In fact, for many distros, the version of GCC used to build the latest > kernel doesn't necessarily match the latest released GCC, so a GCC > mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS > it's probably more common. > > So a lot of users have come to rely on being able to use a different > version of GCC when building OOT modules. > > But with GCC plugins enabled, that's no longer allowed: > > cc1: error: incompatible gcc/plugin versions > cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so > > That error comes from the plugin's call to > plugin_default_version_check(), which strictly enforces the GCC version. > The strict check makes sense, because there's nothing to prevent the GCC > plugin ABI from changing -- and it often does. > > But failing the build isn't necessary. For most plugins, OOT modules > will otherwise work just fine without the plugin instrumentation. > > When a GCC version mismatch is detected, print a warning and disable the > plugin. The only exception is the RANDSTRUCT plugin which needs all > code to see the same struct layouts. In that case print an error. > > [1] Ignoring, for the moment, that the kernel now has > toolchain-dependent kconfig options, which can silently disable > features and cause havoc when compiler versions differ, or even when > certain libraries are missing. This is a separate problem which > also needs to be addressed. > > Reported-by: Ondrej Mosnacek <omosnace@redhat.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > --- We are based on the assumption that we use the same compiler for in-tree and out-of-tree. If people use a different compiler, they must be prepared for any possible problem. Using different compiler flags for in-tree and out-of-tree is even more dangerous. For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled for in-tree build, and then disabled for out-of-tree modules, the struct layout will mismatch, won't it? This patch is ugly, and not doing the right thing. > scripts/Makefile.gcc-plugins | 19 +++++++++++++++++++ > scripts/Makefile.kcov | 11 +++++++++++ > 2 files changed, 30 insertions(+) > > diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins > index 952e46876329..7227692fba59 100644 > --- a/scripts/Makefile.gcc-plugins > +++ b/scripts/Makefile.gcc-plugins > @@ -51,6 +51,25 @@ export DISABLE_ARM_SSP_PER_TASK_PLUGIN > GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) > # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here. > GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS)) > + > +# Out-of-tree module check: If there's a GCC version mismatch, disable plugins > +# and print a warning. Otherwise the OOT module build will fail due to > +# plugin_default_version_check(). > +ifneq ($(GCC_PLUGINS_CFLAGS),) > + ifneq ($(KBUILD_EXTMOD),) > + ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX))) > + > + ifdef CONFIG_GCC_PLUGIN_RANDSTRUCT > + $(error error: CONFIG_GCC_PLUGIN_RANDSTRUCT requires out-of-tree modules to be built using the same GCC version as the kernel.) > + endif > + > + $(warning warning: Disabling GCC plugins for out-of-tree modules due to GCC version mismatch.) > + $(warning warning: The following plugins have been disabled: $(gcc-plugin-y)) > + GCC_PLUGINS_CFLAGS := > + endif > + endif > +endif > + > export GCC_PLUGINS_CFLAGS > > # Add the flags to the build! > diff --git a/scripts/Makefile.kcov b/scripts/Makefile.kcov > index 67e8cfe3474b..63a2bc2aabb2 100644 > --- a/scripts/Makefile.kcov > +++ b/scripts/Makefile.kcov > @@ -3,4 +3,15 @@ kcov-flags-$(CONFIG_CC_HAS_SANCOV_TRACE_PC) += -fsanitize-coverage=trace-pc > kcov-flags-$(CONFIG_KCOV_ENABLE_COMPARISONS) += -fsanitize-coverage=trace-cmp > kcov-flags-$(CONFIG_GCC_PLUGIN_SANCOV) += -fplugin=$(objtree)/scripts/gcc-plugins/sancov_plugin.so > > +# Out-of-tree module check for GCC version mismatch. > +# See the similar check in scripts/Makefile.gcc-plugins > +ifneq ($(CONFIG_GCC_PLUGIN_SANCOV),) > + ifneq ($(KBUILD_EXTMOD),) > + ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX))) > + $(warning warning: Disabling CONFIG_GCC_PLUGIN_SANCOV for out-of-tree modules due to GCC version mismatch.) > + kcov-flags-y := $(filter-out %/sancov_plugin.so, $(kcov-flags-y)) > + endif > + endif > +endif > + > export CFLAGS_KCOV := $(kcov-flags-y) > -- > 2.29.2 >
On Tue, Jan 26, 2021 at 06:16:01AM +0900, Masahiro Yamada wrote: > On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > When building out-of-tree kernel modules, the build system doesn't > > require the GCC version to match the version used to build the original > > kernel. That's probably [1] fine. > > > > In fact, for many distros, the version of GCC used to build the latest > > kernel doesn't necessarily match the latest released GCC, so a GCC > > mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS > > it's probably more common. > > > > So a lot of users have come to rely on being able to use a different > > version of GCC when building OOT modules. > > > > But with GCC plugins enabled, that's no longer allowed: > > > > cc1: error: incompatible gcc/plugin versions > > cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so > > > > That error comes from the plugin's call to > > plugin_default_version_check(), which strictly enforces the GCC version. > > The strict check makes sense, because there's nothing to prevent the GCC > > plugin ABI from changing -- and it often does. > > > > But failing the build isn't necessary. For most plugins, OOT modules > > will otherwise work just fine without the plugin instrumentation. > > > > When a GCC version mismatch is detected, print a warning and disable the > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > code to see the same struct layouts. In that case print an error. > > > > [1] Ignoring, for the moment, that the kernel now has > > toolchain-dependent kconfig options, which can silently disable > > features and cause havoc when compiler versions differ, or even when > > certain libraries are missing. This is a separate problem which > > also needs to be addressed. > > > > Reported-by: Ondrej Mosnacek <omosnace@redhat.com> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > > --- > > > We are based on the assumption that we use the same > compiler for in-tree and out-of-tree. Sorry, but that assumption isn't based in reality. And it's not enforced. > If people use a different compiler, they must be > prepared for any possible problem. > > Using different compiler flags for in-tree and out-of-tree > is even more dangerous. > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > for in-tree build, and then disabled for out-of-tree modules, > the struct layout will mismatch, won't it? If you read the patch you'll notice that it handles that case, when it's caused by GCC mismatch. However, as alluded to in the [1] footnote, it doesn't handle the case where the OOT build system doesn't have gcc-plugin-devel installed. Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build succeeds! That happens even without a GCC mismatch. > This patch is ugly, and not doing the right thing. Maybe, but I doubt the solution is to make assumptions.
On Tue, Jan 26, 2021 at 6:28 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Tue, Jan 26, 2021 at 06:16:01AM +0900, Masahiro Yamada wrote: > > On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > > When building out-of-tree kernel modules, the build system doesn't > > > require the GCC version to match the version used to build the original > > > kernel. That's probably [1] fine. > > > > > > In fact, for many distros, the version of GCC used to build the latest > > > kernel doesn't necessarily match the latest released GCC, so a GCC > > > mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS > > > it's probably more common. > > > > > > So a lot of users have come to rely on being able to use a different > > > version of GCC when building OOT modules. > > > > > > But with GCC plugins enabled, that's no longer allowed: > > > > > > cc1: error: incompatible gcc/plugin versions > > > cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so > > > > > > That error comes from the plugin's call to > > > plugin_default_version_check(), which strictly enforces the GCC version. > > > The strict check makes sense, because there's nothing to prevent the GCC > > > plugin ABI from changing -- and it often does. > > > > > > But failing the build isn't necessary. For most plugins, OOT modules > > > will otherwise work just fine without the plugin instrumentation. > > > > > > When a GCC version mismatch is detected, print a warning and disable the > > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > > code to see the same struct layouts. In that case print an error. > > > > > > [1] Ignoring, for the moment, that the kernel now has > > > toolchain-dependent kconfig options, which can silently disable > > > features and cause havoc when compiler versions differ, or even when > > > certain libraries are missing. This is a separate problem which > > > also needs to be addressed. > > > > > > Reported-by: Ondrej Mosnacek <omosnace@redhat.com> > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > > > --- > > > > > > We are based on the assumption that we use the same > > compiler for in-tree and out-of-tree. > > Sorry, but that assumption isn't based in reality. And it's not > enforced. > > > If people use a different compiler, they must be > > prepared for any possible problem. > > > > Using different compiler flags for in-tree and out-of-tree > > is even more dangerous. > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > > for in-tree build, and then disabled for out-of-tree modules, > > the struct layout will mismatch, won't it? > > If you read the patch you'll notice that it handles that case, when it's > caused by GCC mismatch. > > However, as alluded to in the [1] footnote, it doesn't handle the case > where the OOT build system doesn't have gcc-plugin-devel installed. > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build > succeeds! That happens even without a GCC mismatch. Ah, sorry. I responded too early before reading the patch fully. But, I do not like to make RANDSTRUCT a special case. I'd rather want to stop building for any plugin. > > This patch is ugly, and not doing the right thing. > > Maybe, but I doubt the solution is to make assumptions. > > -- > Josh >
On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > When a GCC version mismatch is detected, print a warning and disable the > plugin. The only exception is the RANDSTRUCT plugin which needs all > code to see the same struct layouts. In that case print an error. I prefer this patch as-is: only randstruct needs a hard failure. The others likely work (in fact, randstruct likely works too). Masahiro, are you suggesting to be a hard-failure for all plugins?
On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote: > > > If people use a different compiler, they must be > > > prepared for any possible problem. > > > > > > Using different compiler flags for in-tree and out-of-tree > > > is even more dangerous. > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > > > for in-tree build, and then disabled for out-of-tree modules, > > > the struct layout will mismatch, won't it? > > > > If you read the patch you'll notice that it handles that case, when it's > > caused by GCC mismatch. > > > > However, as alluded to in the [1] footnote, it doesn't handle the case > > where the OOT build system doesn't have gcc-plugin-devel installed. > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build > > succeeds! That happens even without a GCC mismatch. > > > Ah, sorry. > > I responded too early before reading the patch fully. > > But, I do not like to make RANDSTRUCT a special case. > > I'd rather want to stop building for any plugin. Other than RANDSTRUCT there doesn't seem to be any problem with disabling them (and printing a warning) in the OOT build. Why not give users that option? It's harmless, and will make distro's (and their users') lives easier. Either GCC mismatch is ok, or it's not. Let's not half-enforce it. Also, how do you propose we solve the other problem, where a missing optional library (gcc-plugin-devel) causes the OOT module build to silently disable the plugin? This is related to my earlier complaint about the dangers of toolchain-dependent kconfig options.
On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote: > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > > When a GCC version mismatch is detected, print a warning and disable the > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > code to see the same struct layouts. In that case print an error. > > I prefer this patch as-is: only randstruct needs a hard failure. The > others likely work (in fact, randstruct likely works too). I'm curious about this last statement, why would randstruct likely work? Even struct module has '__randomize_layout', wouldn't basic module init go splat?
On Tue, Jan 26, 2021 at 7:03 AM Kees Cook <keescook@chromium.org> wrote: > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > > When a GCC version mismatch is detected, print a warning and disable the > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > code to see the same struct layouts. In that case print an error. > > I prefer this patch as-is: only randstruct needs a hard failure. The > others likely work (in fact, randstruct likely works too). > > Masahiro, are you suggesting to be a hard-failure for all plugins? Yes. I want to require "I swear to use the same compiler version for external modules" when you enable GCC plugins. config CC_VERSION_CHECK_FOR_EXTERNAL_MODULES bool "Check the compiler version before building external modules" help If this option is enabled, the compiler version is checked before building external modules. This ensures the same compiler is used for the kernel and external modules. config GCC_PLUGINS ... depends on CC_VERSION_CHECK_FOR_EXTERNAL_MODULES In Makefile, check the version for out-of-tree modules if CONFIG_CC_VERSION_CHECK_FOR_EXTERNAL_MODULES. There is no difference in the fact that you cannot use a different compiler for external modules if CONFIG_GCC_PLUGINS=y. We started with the assumption that modules must be compiled by the same compiler as the kernel was. https://lore.kernel.org/patchwork/patch/836247/#1031547 Now that the compiler capability is evaluated in Kconfig, this is a harder requirement. In reality, a different compiler might be used, and, this requirement might be loosened, but the same compiler should be required for CONFIG_GCC_PLUGINS.
On Mon, Jan 25, 2021 at 03:27:55PM -0600, Josh Poimboeuf wrote: > On Tue, Jan 26, 2021 at 06:16:01AM +0900, Masahiro Yamada wrote: > > On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > > When building out-of-tree kernel modules, the build system doesn't > > > require the GCC version to match the version used to build the original > > > kernel. That's probably [1] fine. > > > > > > In fact, for many distros, the version of GCC used to build the latest > > > kernel doesn't necessarily match the latest released GCC, so a GCC > > > mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS > > > it's probably more common. > > > > > > So a lot of users have come to rely on being able to use a different > > > version of GCC when building OOT modules. > > > > > > But with GCC plugins enabled, that's no longer allowed: > > > > > > cc1: error: incompatible gcc/plugin versions > > > cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so > > > > > > That error comes from the plugin's call to > > > plugin_default_version_check(), which strictly enforces the GCC version. > > > The strict check makes sense, because there's nothing to prevent the GCC > > > plugin ABI from changing -- and it often does. > > > > > > But failing the build isn't necessary. For most plugins, OOT modules > > > will otherwise work just fine without the plugin instrumentation. > > > > > > When a GCC version mismatch is detected, print a warning and disable the > > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > > code to see the same struct layouts. In that case print an error. > > > > > > [1] Ignoring, for the moment, that the kernel now has > > > toolchain-dependent kconfig options, which can silently disable > > > features and cause havoc when compiler versions differ, or even when > > > certain libraries are missing. This is a separate problem which > > > also needs to be addressed. > > > > > > Reported-by: Ondrej Mosnacek <omosnace@redhat.com> > > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > > > --- > > > > > > We are based on the assumption that we use the same > > compiler for in-tree and out-of-tree. > > Sorry, but that assumption isn't based in reality. And it's not > enforced. It's "enforced" in that if something breaks because of this, no one will support it :) We have always said, "all kernel code must be built with the exact same compiler and with the same build options". Anyone who does anything different, is on their own. So please, let's not change things to make it as of this might work to hide real problems that are known to show up when people mix/match compilers with modules. thanks, greg k-h
On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote: > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote: > > > > If people use a different compiler, they must be > > > > prepared for any possible problem. > > > > > > > > Using different compiler flags for in-tree and out-of-tree > > > > is even more dangerous. > > > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > > > > for in-tree build, and then disabled for out-of-tree modules, > > > > the struct layout will mismatch, won't it? > > > > > > If you read the patch you'll notice that it handles that case, when it's > > > caused by GCC mismatch. > > > > > > However, as alluded to in the [1] footnote, it doesn't handle the case > > > where the OOT build system doesn't have gcc-plugin-devel installed. > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build > > > succeeds! That happens even without a GCC mismatch. > > > > > > Ah, sorry. > > > > I responded too early before reading the patch fully. > > > > But, I do not like to make RANDSTRUCT a special case. > > > > I'd rather want to stop building for any plugin. > > Other than RANDSTRUCT there doesn't seem to be any problem with > disabling them (and printing a warning) in the OOT build. Why not give > users that option? It's harmless, and will make distro's (and their > users') lives easier. > > Either GCC mismatch is ok, or it's not. Let's not half-enforce it. As I said earlier, it's not ok, we can not support it at all. thanks, greg k-h
> We started with the assumption that modules must be compiled > by the same compiler as the kernel was. > https://lore.kernel.org/patchwork/patch/836247/#1031547 > > Now that the compiler capability is evaluated in Kconfig, > this is a harder requirement. > > In reality, a different compiler might be used, > and, this requirement might be loosened, but > the same compiler should be required for CONFIG_GCC_PLUGINS. Hmmmm. If I'm building a module to load into a distro kernel it is very unlikely that I'll actually have the same version of the compiler installed as that used to build the kernel. Additionally some of the .o files (that don't refer to any kernel headers) may have been compiled with an entirely different compiler altogether. Whether any plugins are actually installed is another problem. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote: > > On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote: > > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote: > > > > > If people use a different compiler, they must be > > > > > prepared for any possible problem. > > > > > > > > > > Using different compiler flags for in-tree and out-of-tree > > > > > is even more dangerous. > > > > > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > > > > > for in-tree build, and then disabled for out-of-tree modules, > > > > > the struct layout will mismatch, won't it? > > > > > > > > If you read the patch you'll notice that it handles that case, when it's > > > > caused by GCC mismatch. > > > > > > > > However, as alluded to in the [1] footnote, it doesn't handle the case > > > > where the OOT build system doesn't have gcc-plugin-devel installed. > > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build > > > > succeeds! That happens even without a GCC mismatch. > > > > > > > > > Ah, sorry. > > > > > > I responded too early before reading the patch fully. > > > > > > But, I do not like to make RANDSTRUCT a special case. > > > > > > I'd rather want to stop building for any plugin. > > > > Other than RANDSTRUCT there doesn't seem to be any problem with > > disabling them (and printing a warning) in the OOT build. Why not give > > users that option? It's harmless, and will make distro's (and their > > users') lives easier. > > > > Either GCC mismatch is ok, or it's not. Let's not half-enforce it. > > As I said earlier, it's not ok, we can not support it at all. > Support and enforce are 2 completely different things. To shed a bit more light on this, the real issue that prompted this was breaking CI systems. As we enabled gcc plugins in Fedora, and the toolchain folks went through 3 different snapshots of gcc 11 in a week. Any CI process that built an out of tree module failed. I don't think this is nearly as much of a concern for stable distros, as it is for CI in development cycles. Justin
On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote: > On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote: > > > > On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote: > > > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote: > > > > > > If people use a different compiler, they must be > > > > > > prepared for any possible problem. > > > > > > > > > > > > Using different compiler flags for in-tree and out-of-tree > > > > > > is even more dangerous. > > > > > > > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > > > > > > for in-tree build, and then disabled for out-of-tree modules, > > > > > > the struct layout will mismatch, won't it? > > > > > > > > > > If you read the patch you'll notice that it handles that case, when it's > > > > > caused by GCC mismatch. > > > > > > > > > > However, as alluded to in the [1] footnote, it doesn't handle the case > > > > > where the OOT build system doesn't have gcc-plugin-devel installed. > > > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build > > > > > succeeds! That happens even without a GCC mismatch. > > > > > > > > > > > > Ah, sorry. > > > > > > > > I responded too early before reading the patch fully. > > > > > > > > But, I do not like to make RANDSTRUCT a special case. > > > > > > > > I'd rather want to stop building for any plugin. > > > > > > Other than RANDSTRUCT there doesn't seem to be any problem with > > > disabling them (and printing a warning) in the OOT build. Why not give > > > users that option? It's harmless, and will make distro's (and their > > > users') lives easier. > > > > > > Either GCC mismatch is ok, or it's not. Let's not half-enforce it. > > > > As I said earlier, it's not ok, we can not support it at all. > > > > Support and enforce are 2 completely different things. To shed a bit > more light on this, the real issue that prompted this was breaking CI > systems. As we enabled gcc plugins in Fedora, and the toolchain folks > went through 3 different snapshots of gcc 11 in a week. Any CI process > that built an out of tree module failed. I don't think this is nearly > as much of a concern for stable distros, as it is for CI in > development cycles. It's better to have an obvious break like this than to silently accept things and then have a much harder issue to debug at runtime, right? Don't allow things that we know we will not support, this sounds like an issue with your CI systems, not with the kernel build system, why not just fix that? :) thnaks, greg k-h
On Tue, Jan 26, 2021 at 02:51:29PM +0100, Greg KH wrote: > On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote: > > On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote: > > > > > > On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote: > > > > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote: > > > > > > > If people use a different compiler, they must be > > > > > > > prepared for any possible problem. > > > > > > > > > > > > > > Using different compiler flags for in-tree and out-of-tree > > > > > > > is even more dangerous. > > > > > > > > > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > > > > > > > for in-tree build, and then disabled for out-of-tree modules, > > > > > > > the struct layout will mismatch, won't it? > > > > > > > > > > > > If you read the patch you'll notice that it handles that case, when it's > > > > > > caused by GCC mismatch. > > > > > > > > > > > > However, as alluded to in the [1] footnote, it doesn't handle the case > > > > > > where the OOT build system doesn't have gcc-plugin-devel installed. > > > > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build > > > > > > succeeds! That happens even without a GCC mismatch. > > > > > > > > > > > > > > > Ah, sorry. > > > > > > > > > > I responded too early before reading the patch fully. > > > > > > > > > > But, I do not like to make RANDSTRUCT a special case. > > > > > > > > > > I'd rather want to stop building for any plugin. > > > > > > > > Other than RANDSTRUCT there doesn't seem to be any problem with > > > > disabling them (and printing a warning) in the OOT build. Why not give > > > > users that option? It's harmless, and will make distro's (and their > > > > users') lives easier. > > > > > > > > Either GCC mismatch is ok, or it's not. Let's not half-enforce it. > > > > > > As I said earlier, it's not ok, we can not support it at all. > > > > > > > Support and enforce are 2 completely different things. To shed a bit > > more light on this, the real issue that prompted this was breaking CI > > systems. As we enabled gcc plugins in Fedora, and the toolchain folks > > went through 3 different snapshots of gcc 11 in a week. Any CI process > > that built an out of tree module failed. I don't think this is nearly > > as much of a concern for stable distros, as it is for CI in > > development cycles. > > It's better to have an obvious break like this than to silently accept > things and then have a much harder issue to debug at runtime, right? User space mixes compiler versions all the time. The C ABI is stable. What specifically is the harder issue you're referring to?
On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > On Tue, Jan 26, 2021 at 02:51:29PM +0100, Greg KH wrote: > > On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote: > > > On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote: > > > > > > > > On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote: > > > > > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote: > > > > > > > > If people use a different compiler, they must be > > > > > > > > prepared for any possible problem. > > > > > > > > > > > > > > > > Using different compiler flags for in-tree and out-of-tree > > > > > > > > is even more dangerous. > > > > > > > > > > > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled > > > > > > > > for in-tree build, and then disabled for out-of-tree modules, > > > > > > > > the struct layout will mismatch, won't it? > > > > > > > > > > > > > > If you read the patch you'll notice that it handles that case, when it's > > > > > > > caused by GCC mismatch. > > > > > > > > > > > > > > However, as alluded to in the [1] footnote, it doesn't handle the case > > > > > > > where the OOT build system doesn't have gcc-plugin-devel installed. > > > > > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build > > > > > > > succeeds! That happens even without a GCC mismatch. > > > > > > > > > > > > > > > > > > Ah, sorry. > > > > > > > > > > > > I responded too early before reading the patch fully. > > > > > > > > > > > > But, I do not like to make RANDSTRUCT a special case. > > > > > > > > > > > > I'd rather want to stop building for any plugin. > > > > > > > > > > Other than RANDSTRUCT there doesn't seem to be any problem with > > > > > disabling them (and printing a warning) in the OOT build. Why not give > > > > > users that option? It's harmless, and will make distro's (and their > > > > > users') lives easier. > > > > > > > > > > Either GCC mismatch is ok, or it's not. Let's not half-enforce it. > > > > > > > > As I said earlier, it's not ok, we can not support it at all. > > > > > > > > > > Support and enforce are 2 completely different things. To shed a bit > > > more light on this, the real issue that prompted this was breaking CI > > > systems. As we enabled gcc plugins in Fedora, and the toolchain folks > > > went through 3 different snapshots of gcc 11 in a week. Any CI process > > > that built an out of tree module failed. I don't think this is nearly > > > as much of a concern for stable distros, as it is for CI in > > > development cycles. > > > > It's better to have an obvious break like this than to silently accept > > things and then have a much harder issue to debug at runtime, right? > > User space mixes compiler versions all the time. The C ABI is stable. > > What specifically is the harder issue you're referring to? Have you not noticed include/linux/compiler.h and all of the different changes/workarounds we do for different versions of gcc/clang/intel compilers? We have never guaranteed that a kernel module would work that was built with a different compiler than the main kernel, and I doubt we can start now. thanks, greg k-h
On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > User space mixes compiler versions all the time. The C ABI is stable. > > What specifically is the harder issue you're referring to? I don't think the C ABI captures nearly enough. Imagine trying to mix a compiler with and without asm-goto support (ok, we fail to build without by now, but just imagine). No C ABI violated, but having that GCC extention vs not having it radically changes the kernel ABI. I think I'm with Greg here, just don't do it.
On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote: > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > > User space mixes compiler versions all the time. The C ABI is stable. > > > > What specifically is the harder issue you're referring to? > > I don't think the C ABI captures nearly enough. Imagine trying to mix a > compiler with and without asm-goto support (ok, we fail to build without > by now, but just imagine). > > No C ABI violated, but having that GCC extention vs not having it > radically changes the kernel ABI. > > I think I'm with Greg here, just don't do it. Ok, thank you for an actual example. asm goto is a good one. But it's not a cut-and-dry issue. Otherwise how could modversions possibly work? So yes, we should enforce GCC versions, but I still haven't seen a reason it should be more than just "same compiler and *major* version".
On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote: > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote: > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > > > User space mixes compiler versions all the time. The C ABI is stable. > > > > > > What specifically is the harder issue you're referring to? > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a > > compiler with and without asm-goto support (ok, we fail to build without > > by now, but just imagine). > > > > No C ABI violated, but having that GCC extention vs not having it > > radically changes the kernel ABI. > > > > I think I'm with Greg here, just don't do it. > > Ok, thank you for an actual example. asm goto is a good one. > > But it's not a cut-and-dry issue. Otherwise how could modversions > possibly work? > > So yes, we should enforce GCC versions, but I still haven't seen a > reason it should be more than just "same compiler and *major* version". Why bother? rebuilding the kernel and all modules is a matter of 10 minutes at most on a decently beefy build box. What actual problem are we trying to solve here?
On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote: > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote: > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > > > > User space mixes compiler versions all the time. The C ABI is stable. > > > > > > > > What specifically is the harder issue you're referring to? > > > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a > > > compiler with and without asm-goto support (ok, we fail to build without > > > by now, but just imagine). > > > > > > No C ABI violated, but having that GCC extention vs not having it > > > radically changes the kernel ABI. > > > > > > I think I'm with Greg here, just don't do it. > > > > Ok, thank you for an actual example. asm goto is a good one. > > > > But it's not a cut-and-dry issue. Otherwise how could modversions > > possibly work? > > > > So yes, we should enforce GCC versions, but I still haven't seen a > > reason it should be more than just "same compiler and *major* version". > > Why bother? rebuilding the kernel and all modules is a matter of 10 > minutes at most on a decently beefy build box. > > What actual problem are we trying to solve here? This is true for those of us used to working with source and building by hand. For users who want everything packaged, rebuilding a kernel package for install is considerably longer, and for distros providing builds for multiple arches, we are looking at a couple of hours at best. From a Fedora standpoint, I am perfectly fine with it failing if someone tries to build a module on gcc10 when the kernel was built with gcc11. It's tedious when the kernel was built with gcc11 yesterday, and a new gcc11 build today means that kernel needs to be rebuilt.
On Tue, Jan 26, 2021 at 10:15:52AM -0600, Justin Forbes wrote: > On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote: > > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote: > > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > > > > > User space mixes compiler versions all the time. The C ABI is stable. > > > > > > > > > > What specifically is the harder issue you're referring to? > > > > > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a > > > > compiler with and without asm-goto support (ok, we fail to build without > > > > by now, but just imagine). > > > > > > > > No C ABI violated, but having that GCC extention vs not having it > > > > radically changes the kernel ABI. > > > > > > > > I think I'm with Greg here, just don't do it. > > > > > > Ok, thank you for an actual example. asm goto is a good one. > > > > > > But it's not a cut-and-dry issue. Otherwise how could modversions > > > possibly work? > > > > > > So yes, we should enforce GCC versions, but I still haven't seen a > > > reason it should be more than just "same compiler and *major* version". > > > > Why bother? rebuilding the kernel and all modules is a matter of 10 > > minutes at most on a decently beefy build box. > > > > What actual problem are we trying to solve here? > > This is true for those of us used to working with source and building > by hand. For users who want everything packaged, rebuilding a kernel > package for install is considerably longer, and for distros providing > builds for multiple arches, we are looking at a couple of hours at > best. From a Fedora standpoint, I am perfectly fine with it failing > if someone tries to build a module on gcc10 when the kernel was built > with gcc11. It's tedious when the kernel was built with gcc11 > yesterday, and a new gcc11 build today means that kernel needs to be > rebuilt. Right. It's a problem for distro users. The compiler and kernel are in separate packages, with separate release cadences. The latest compiler version may not exactly match what was used to build the latest kernel.
From: Peter Zijlstra <peterz@infradead.org> > Sent: 26 January 2021 16:05 > > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote: > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote: > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > > > > User space mixes compiler versions all the time. The C ABI is stable. > > > > > > > > What specifically is the harder issue you're referring to? > > > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a > > > compiler with and without asm-goto support (ok, we fail to build without > > > by now, but just imagine). > > > > > > No C ABI violated, but having that GCC extention vs not having it > > > radically changes the kernel ABI. > > > > > > I think I'm with Greg here, just don't do it. > > > > Ok, thank you for an actual example. asm goto is a good one. > > > > But it's not a cut-and-dry issue. Otherwise how could modversions > > possibly work? > > > > So yes, we should enforce GCC versions, but I still haven't seen a > > reason it should be more than just "same compiler and *major* version". > > Why bother? rebuilding the kernel and all modules is a matter of 10 > minutes at most on a decently beefy build box. > > What actual problem are we trying to solve here? People build modules to load into disrto-provided kernels. I'd have though the compiler would need to support the same options as that used to build the kernel - but not necessarily be exactly the same version. Ignoring compiler bugs (which will bight you if the compiler you have is broken) then a newer compiler ought to be fine. Or a kernel might have been built with config options that don't require features of the compiler being used - so that modules can be built with an older compiler. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Tue, Jan 26, 2021 at 10:19:34AM -0600, Josh Poimboeuf wrote: > On Tue, Jan 26, 2021 at 10:15:52AM -0600, Justin Forbes wrote: > > On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote: > > > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote: > > > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > > > > > > User space mixes compiler versions all the time. The C ABI is stable. > > > > > > > > > > > > What specifically is the harder issue you're referring to? > > > > > > > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a > > > > > compiler with and without asm-goto support (ok, we fail to build without > > > > > by now, but just imagine). > > > > > > > > > > No C ABI violated, but having that GCC extention vs not having it > > > > > radically changes the kernel ABI. > > > > > > > > > > I think I'm with Greg here, just don't do it. > > > > > > > > Ok, thank you for an actual example. asm goto is a good one. > > > > > > > > But it's not a cut-and-dry issue. Otherwise how could modversions > > > > possibly work? > > > > > > > > So yes, we should enforce GCC versions, but I still haven't seen a > > > > reason it should be more than just "same compiler and *major* version". > > > > > > Why bother? rebuilding the kernel and all modules is a matter of 10 > > > minutes at most on a decently beefy build box. > > > > > > What actual problem are we trying to solve here? > > > > This is true for those of us used to working with source and building > > by hand. For users who want everything packaged, rebuilding a kernel > > package for install is considerably longer, and for distros providing > > builds for multiple arches, we are looking at a couple of hours at > > best. From a Fedora standpoint, I am perfectly fine with it failing > > if someone tries to build a module on gcc10 when the kernel was built > > with gcc11. It's tedious when the kernel was built with gcc11 > > yesterday, and a new gcc11 build today means that kernel needs to be > > rebuilt. > > Right. It's a problem for distro users. The compiler and kernel are in > separate packages, with separate release cadences. The latest compiler > version may not exactly match what was used to build the latest kernel. Given that distros _should_ be updating their kernel faster than the compiler updates, what's the real issue here? You build a kernel, and all external modules, at the same time. If you want to build them at different times, you make your build system ensure they were the same compiler so that you are "bug compatible". And yes, it might be a pain if gcc11 gets updated every other day, but as someone living with a "rolling-distro" you get used to it, otherwise you just "pin" the build tools and keep that from happening. This isn't a new thing, we've been doing this for decades, why is this surprising? thanks, greg k-h
On Tue, Jan 26, 2021 at 11:07 AM Greg KH <greg@kroah.com> wrote: > > On Tue, Jan 26, 2021 at 10:19:34AM -0600, Josh Poimboeuf wrote: > > On Tue, Jan 26, 2021 at 10:15:52AM -0600, Justin Forbes wrote: > > > On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote: > > > > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote: > > > > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote: > > > > > > > User space mixes compiler versions all the time. The C ABI is stable. > > > > > > > > > > > > > > What specifically is the harder issue you're referring to? > > > > > > > > > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a > > > > > > compiler with and without asm-goto support (ok, we fail to build without > > > > > > by now, but just imagine). > > > > > > > > > > > > No C ABI violated, but having that GCC extention vs not having it > > > > > > radically changes the kernel ABI. > > > > > > > > > > > > I think I'm with Greg here, just don't do it. > > > > > > > > > > Ok, thank you for an actual example. asm goto is a good one. > > > > > > > > > > But it's not a cut-and-dry issue. Otherwise how could modversions > > > > > possibly work? > > > > > > > > > > So yes, we should enforce GCC versions, but I still haven't seen a > > > > > reason it should be more than just "same compiler and *major* version". > > > > > > > > Why bother? rebuilding the kernel and all modules is a matter of 10 > > > > minutes at most on a decently beefy build box. > > > > > > > > What actual problem are we trying to solve here? > > > > > > This is true for those of us used to working with source and building > > > by hand. For users who want everything packaged, rebuilding a kernel > > > package for install is considerably longer, and for distros providing > > > builds for multiple arches, we are looking at a couple of hours at > > > best. From a Fedora standpoint, I am perfectly fine with it failing > > > if someone tries to build a module on gcc10 when the kernel was built > > > with gcc11. It's tedious when the kernel was built with gcc11 > > > yesterday, and a new gcc11 build today means that kernel needs to be > > > rebuilt. > > > > Right. It's a problem for distro users. The compiler and kernel are in > > separate packages, with separate release cadences. The latest compiler > > version may not exactly match what was used to build the latest kernel. > > Given that distros _should_ be updating their kernel faster than the > compiler updates, what's the real issue here? You build a kernel, and > all external modules, at the same time. If you want to build them at > different times, you make your build system ensure they were the same > compiler so that you are "bug compatible". > > And yes, it might be a pain if gcc11 gets updated every other day, but > as someone living with a "rolling-distro" you get used to it, otherwise > you just "pin" the build tools and keep that from happening. > > This isn't a new thing, we've been doing this for decades, why is this > surprising? We definitely build considerably more kernels than toolchains. From a rawhide standpoint though, a number of testers are willing to test RC releases, but are not willing to run debug kernels, so they installed rc4 yesterday, but will not install today's snapshot. I will build 3-5 new kernels before they update to rc5. We have been doing things this way for over a decade. It has never been a problem until we turned on CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL. Suddenly I am getting complaints.
On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote: > On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote: > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > > > When a GCC version mismatch is detected, print a warning and disable the > > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > > code to see the same struct layouts. In that case print an error. > > > > I prefer this patch as-is: only randstruct needs a hard failure. The > > others likely work (in fact, randstruct likely works too). > > I'm curious about this last statement, why would randstruct likely work? > > Even struct module has '__randomize_layout', wouldn't basic module init > go splat? No; the seed is part of the generate includes -- you'll get the same layout with the same seed.
On Tue, Jan 26, 2021 at 09:56:10AM -0800, Kees Cook wrote: > On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote: > > On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote: > > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > > > > When a GCC version mismatch is detected, print a warning and disable the > > > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > > > code to see the same struct layouts. In that case print an error. > > > > > > I prefer this patch as-is: only randstruct needs a hard failure. The > > > others likely work (in fact, randstruct likely works too). > > > > I'm curious about this last statement, why would randstruct likely work? > > > > Even struct module has '__randomize_layout', wouldn't basic module init > > go splat? > > No; the seed is part of the generate includes -- you'll get the same > layout with the same seed. Right, but don't you need the plugin enabled to make use of that seed, so the structs get interpreted properly by the module? Or am I completely misunderstanding how this plugin works?
On Tue, Jan 26, 2021 at 12:43:16PM -0600, Josh Poimboeuf wrote: > On Tue, Jan 26, 2021 at 09:56:10AM -0800, Kees Cook wrote: > > On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote: > > > On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote: > > > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > > > > > When a GCC version mismatch is detected, print a warning and disable the > > > > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > > > > code to see the same struct layouts. In that case print an error. > > > > > > > > I prefer this patch as-is: only randstruct needs a hard failure. The > > > > others likely work (in fact, randstruct likely works too). > > > > > > I'm curious about this last statement, why would randstruct likely work? > > > > > > Even struct module has '__randomize_layout', wouldn't basic module init > > > go splat? > > > > No; the seed is part of the generate includes -- you'll get the same > > layout with the same seed. > > Right, but don't you need the plugin enabled to make use of that seed, > so the structs get interpreted properly by the module? Or am I > completely misunderstanding how this plugin works? Having the plugin enabled or not is part of the Kconfig ... you can't build anything if you change Kconfig. I feel like I'm missing something...
On Tue, Jan 26, 2021 at 02:59:57PM -0800, Kees Cook wrote: > On Tue, Jan 26, 2021 at 12:43:16PM -0600, Josh Poimboeuf wrote: > > On Tue, Jan 26, 2021 at 09:56:10AM -0800, Kees Cook wrote: > > > On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote: > > > > On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote: > > > > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > > > > > > When a GCC version mismatch is detected, print a warning and disable the > > > > > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > > > > > code to see the same struct layouts. In that case print an error. > > > > > > > > > > I prefer this patch as-is: only randstruct needs a hard failure. The > > > > > others likely work (in fact, randstruct likely works too). > > > > > > > > I'm curious about this last statement, why would randstruct likely work? > > > > > > > > Even struct module has '__randomize_layout', wouldn't basic module init > > > > go splat? > > > > > > No; the seed is part of the generate includes -- you'll get the same > > > layout with the same seed. > > > > Right, but don't you need the plugin enabled to make use of that seed, > > so the structs get interpreted properly by the module? Or am I > > completely misunderstanding how this plugin works? > > Having the plugin enabled or not is part of the Kconfig ... you can't > build anything if you change Kconfig. I feel like I'm missing > something... I guess we crossed wires somehow. Backing up :-) The patch disables plugins when there's a GCC mismatch in the OOT module build, with the exception of RANDSTRUCT, for which it just errors out. When you said "randstruct likely works too" I thought you meant that RANDSTRUCT would likely work even if it were disabled in the OOT module build (i.e. if we removed the RANDSTRUCT special case from the patch). Or did you mean something else? Like using RANDSTRUCT with a different version of GCC would likely work? (I'm definitely not proposing we allow GCC mismatches for plugins, as I was told that plugins can break from one build to the next).
Please don't add all this garbage. We only add infrastructure to the kernel for what the kernel itself needs, not for weird out of tree infrastructure. On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > When building out-of-tree kernel modules, the build system doesn't > require the GCC version to match the version used to build the original > kernel. That's probably [1] fine. > > In fact, for many distros, the version of GCC used to build the latest > kernel doesn't necessarily match the latest released GCC, so a GCC > mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS > it's probably more common. > > So a lot of users have come to rely on being able to use a different > version of GCC when building OOT modules. > > But with GCC plugins enabled, that's no longer allowed: > > cc1: error: incompatible gcc/plugin versions > cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so > > That error comes from the plugin's call to > plugin_default_version_check(), which strictly enforces the GCC version. > The strict check makes sense, because there's nothing to prevent the GCC > plugin ABI from changing -- and it often does. > > But failing the build isn't necessary. For most plugins, OOT modules > will otherwise work just fine without the plugin instrumentation. > > When a GCC version mismatch is detected, print a warning and disable the > plugin. The only exception is the RANDSTRUCT plugin which needs all > code to see the same struct layouts. In that case print an error. > > [1] Ignoring, for the moment, that the kernel now has > toolchain-dependent kconfig options, which can silently disable > features and cause havoc when compiler versions differ, or even when > certain libraries are missing. This is a separate problem which > also needs to be addressed. > > Reported-by: Ondrej Mosnacek <omosnace@redhat.com> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> > --- > scripts/Makefile.gcc-plugins | 19 +++++++++++++++++++ > scripts/Makefile.kcov | 11 +++++++++++ > 2 files changed, 30 insertions(+) > > diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins > index 952e46876329..7227692fba59 100644 > --- a/scripts/Makefile.gcc-plugins > +++ b/scripts/Makefile.gcc-plugins > @@ -51,6 +51,25 @@ export DISABLE_ARM_SSP_PER_TASK_PLUGIN > GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) > # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here. > GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS)) > + > +# Out-of-tree module check: If there's a GCC version mismatch, disable plugins > +# and print a warning. Otherwise the OOT module build will fail due to > +# plugin_default_version_check(). > +ifneq ($(GCC_PLUGINS_CFLAGS),) > + ifneq ($(KBUILD_EXTMOD),) > + ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX))) > + > + ifdef CONFIG_GCC_PLUGIN_RANDSTRUCT > + $(error error: CONFIG_GCC_PLUGIN_RANDSTRUCT requires out-of-tree modules to be built using the same GCC version as the kernel.) > + endif > + > + $(warning warning: Disabling GCC plugins for out-of-tree modules due to GCC version mismatch.) > + $(warning warning: The following plugins have been disabled: $(gcc-plugin-y)) > + GCC_PLUGINS_CFLAGS := > + endif > + endif > +endif > + > export GCC_PLUGINS_CFLAGS > > # Add the flags to the build! > diff --git a/scripts/Makefile.kcov b/scripts/Makefile.kcov > index 67e8cfe3474b..63a2bc2aabb2 100644 > --- a/scripts/Makefile.kcov > +++ b/scripts/Makefile.kcov > @@ -3,4 +3,15 @@ kcov-flags-$(CONFIG_CC_HAS_SANCOV_TRACE_PC) += -fsanitize-coverage=trace-pc > kcov-flags-$(CONFIG_KCOV_ENABLE_COMPARISONS) += -fsanitize-coverage=trace-cmp > kcov-flags-$(CONFIG_GCC_PLUGIN_SANCOV) += -fplugin=$(objtree)/scripts/gcc-plugins/sancov_plugin.so > > +# Out-of-tree module check for GCC version mismatch. > +# See the similar check in scripts/Makefile.gcc-plugins > +ifneq ($(CONFIG_GCC_PLUGIN_SANCOV),) > + ifneq ($(KBUILD_EXTMOD),) > + ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX))) > + $(warning warning: Disabling CONFIG_GCC_PLUGIN_SANCOV for out-of-tree modules due to GCC version mismatch.) > + kcov-flags-y := $(filter-out %/sancov_plugin.so, $(kcov-flags-y)) > + endif > + endif > +endif > + > export CFLAGS_KCOV := $(kcov-flags-y) > -- > 2.29.2 > ---end quoted text---
On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote: > Support and enforce are 2 completely different things. To shed a bit > more light on this, the real issue that prompted this was breaking CI > systems. As we enabled gcc plugins in Fedora, and the toolchain folks > went through 3 different snapshots of gcc 11 in a week. Any CI process > that built an out of tree module failed. I don't think this is nearly > as much of a concern for stable distros, as it is for CI in > development cycles. And the answer is trivial: don't build out of tree modules.
On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote: > Please don't add all this garbage. We only add infrastructure to the > kernel for what the kernel itself needs, not for weird out of tree > infrastructure. This isn't new, the kernel already has the infrastructure for building out-of-tree modules. It's widely used. Are you suggesting we remove it? Good luck with that... Either it should be supported, or not. Make the case either way. But I can't understand why people are advocating to leave it half-broken.
On Wed, Jan 27, 2021 at 12:38:56PM -0600, Josh Poimboeuf wrote: > On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote: > > Please don't add all this garbage. We only add infrastructure to the > > kernel for what the kernel itself needs, not for weird out of tree > > infrastructure. > > This isn't new, the kernel already has the infrastructure for building > out-of-tree modules. It's widely used. Are you suggesting we remove > it? Good luck with that... > > Either it should be supported, or not. Make the case either way. But I > can't understand why people are advocating to leave it half-broken. It is not support as any kind of interface. It is a little aid for local development. Adding any kond of complexities for out of tree modules is a complete no-go.
On Wed, Jan 27, 2021 at 06:43:27PM +0000, Christoph Hellwig wrote: > On Wed, Jan 27, 2021 at 12:38:56PM -0600, Josh Poimboeuf wrote: > > On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote: > > > Please don't add all this garbage. We only add infrastructure to the > > > kernel for what the kernel itself needs, not for weird out of tree > > > infrastructure. > > > > This isn't new, the kernel already has the infrastructure for building > > out-of-tree modules. It's widely used. Are you suggesting we remove > > it? Good luck with that... > > > > Either it should be supported, or not. Make the case either way. But I > > can't understand why people are advocating to leave it half-broken. > > > It is not support as any kind of interface. It is a little aid for > local development. Is this a joke? I've never met anybody who builds OOT modules as a development aid... On the other hand I know of several very popular distros (some paid, some not) who rely on allowing users/partners to build OOT modules as part of their ecosystem. To say it's not supported is a farce.
From: Josh Poimboeuf > Sent: 27 January 2021 18:51 > > On Wed, Jan 27, 2021 at 06:43:27PM +0000, Christoph Hellwig wrote: > > On Wed, Jan 27, 2021 at 12:38:56PM -0600, Josh Poimboeuf wrote: > > > On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote: > > > > Please don't add all this garbage. We only add infrastructure to the > > > > kernel for what the kernel itself needs, not for weird out of tree > > > > infrastructure. > > > > > > This isn't new, the kernel already has the infrastructure for building > > > out-of-tree modules. It's widely used. Are you suggesting we remove > > > it? Good luck with that... > > > > > > Either it should be supported, or not. Make the case either way. But I > > > can't understand why people are advocating to leave it half-broken. > > > > > > It is not support as any kind of interface. It is a little aid for > > local development. > > Is this a joke? I've never met anybody who builds OOT modules as a > development aid... > > On the other hand I know of several very popular distros (some paid, > some not) who rely on allowing users/partners to build OOT modules as > part of their ecosystem. To say it's not supported is a farce. Indeed there are plenty of companies who provide kernel modules (wholly or partly in source form) for their customers to build as OOT modules to install in distro built kernels. These modules have to compile against everything from RHEL6 (2.6.32 base) through to the current -rc release. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Wed, Jan 27, 2021 at 12:51:13PM -0600, Josh Poimboeuf wrote: > Is this a joke? I've never met anybody who builds OOT modules as a > development aid... I'm pretty sure you've met me before. > On the other hand I know of several very popular distros (some paid, > some not) who rely on allowing users/partners to build OOT modules as > part of their ecosystem. To say it's not supported is a farce. This is not a farce. The kernel only supports infrastructure for the kernel itself, not for any external consumers. If you have a business model that relies on something else you should think hard if you are in the right business.
On Thu, Jan 28, 2021 at 02:29:52PM +0000, Christoph Hellwig wrote: > On Wed, Jan 27, 2021 at 12:51:13PM -0600, Josh Poimboeuf wrote: > > Is this a joke? I've never met anybody who builds OOT modules as a > > development aid... > > I'm pretty sure you've met me before. Yes, but I don't recall this topic coming up :-) > > On the other hand I know of several very popular distros (some paid, > > some not) who rely on allowing users/partners to build OOT modules as > > part of their ecosystem. To say it's not supported is a farce. > > This is not a farce. The kernel only supports infrastructure for the > kernel itself, not for any external consumers. If you have a business > model that relies on something else you should think hard if you are in > the right business. Thanks for letting me know. I'll send the memo to the billions of affected devices.
On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > When building out-of-tree kernel modules, the build system doesn't > require the GCC version to match the version used to build the original > kernel. That's probably [1] fine. > > In fact, for many distros, the version of GCC used to build the latest > kernel doesn't necessarily match the latest released GCC, so a GCC > mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS > it's probably more common. > > So a lot of users have come to rely on being able to use a different > version of GCC when building OOT modules. > > But with GCC plugins enabled, that's no longer allowed: > > cc1: error: incompatible gcc/plugin versions > cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so > > That error comes from the plugin's call to > plugin_default_version_check(), which strictly enforces the GCC version. > The strict check makes sense, because there's nothing to prevent the GCC > plugin ABI from changing -- and it often does. > > But failing the build isn't necessary. For most plugins, OOT modules > will otherwise work just fine without the plugin instrumentation. > > When a GCC version mismatch is detected, print a warning and disable the > plugin. The only exception is the RANDSTRUCT plugin which needs all > code to see the same struct layouts. In that case print an error. Hi Masahiro, This problem is becoming more prevalent. We will need to fix it one way or another, if we want to support distro adoption of these GCC plugin-based features. Frank suggested a possibly better idea: always rebuild the plugins when the GCC version changes. What do you think? Any suggestions on how to implement that? Otherwise I can try to hack something together.
On Wed, Mar 3, 2021 at 8:27 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote: > > When building out-of-tree kernel modules, the build system doesn't > > require the GCC version to match the version used to build the original > > kernel. That's probably [1] fine. > > > > In fact, for many distros, the version of GCC used to build the latest > > kernel doesn't necessarily match the latest released GCC, so a GCC > > mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS > > it's probably more common. > > > > So a lot of users have come to rely on being able to use a different > > version of GCC when building OOT modules. > > > > But with GCC plugins enabled, that's no longer allowed: > > > > cc1: error: incompatible gcc/plugin versions > > cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so > > > > That error comes from the plugin's call to > > plugin_default_version_check(), which strictly enforces the GCC version. > > The strict check makes sense, because there's nothing to prevent the GCC > > plugin ABI from changing -- and it often does. > > > > But failing the build isn't necessary. For most plugins, OOT modules > > will otherwise work just fine without the plugin instrumentation. > > > > When a GCC version mismatch is detected, print a warning and disable the > > plugin. The only exception is the RANDSTRUCT plugin which needs all > > code to see the same struct layouts. In that case print an error. > > Hi Masahiro, > > This problem is becoming more prevalent. We will need to fix it one way > or another, if we want to support distro adoption of these GCC > plugin-based features. > > Frank suggested a possibly better idea: always rebuild the plugins when > the GCC version changes. That is just another form of the previous patch, which was already rejected. - That is a hack just for external modules - Our consensus is, use the same version for the kernel and external modules I use Ubuntu, and I do not see such a problem. (I have never seen it on Debian either, except sid.) I see Fedora providing GCC whose version is different from the one used for building the kernel. That is a problem on the distribution side. In my ubuntu. $ grep CC_VERSION_TEXT /lib/modules/$(uname -r)/build/.config CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0" $ gcc --version | head -n1 gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0 > What do you think? NACK. > Any suggestions on how to > implement that? Otherwise I can try to hack something together. > > -- > Josh > -- Best Regards Masahiro Yamada
On Thu, Mar 04, 2021 at 03:49:35AM +0900, Masahiro Yamada wrote: > > This problem is becoming more prevalent. We will need to fix it one way > > or another, if we want to support distro adoption of these GCC > > plugin-based features. > > > > Frank suggested a possibly better idea: always rebuild the plugins when > > the GCC version changes. > > > That is just another form of the previous patch, > which was already rejected. > > > - That is a hack just for external modules > - Our consensus is, use the same version for the kernel and external modules > > > > I use Ubuntu, and I do not see such a problem. > (I have never seen it on Debian either, except sid.) > > I see Fedora providing GCC whose version is different > from the one used for building the kernel. > That is a problem on the distribution side. I don't understand. Are you suggesting that a distro should always release GCC and kernel updates simultaneously? How is this problem specific to Fedora? Please be specific about what Ubuntu and Debian do, which Fedora doesn't do. Adding Linus, who indicated in another thread that we shouldn't force exact GCC versions because there's no technical reason to do so.
On Wed, Mar 3, 2021 at 11:15 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > Adding Linus, who indicated in another thread that we shouldn't force > exact GCC versions because there's no technical reason to do so. I do not believe we should recompile everything just because the gcc version changes. But gcc _plugins_ certainly should depend on the kernel version. Very few people should be enabling the gcc plugins in the first place. Honestly, most of them are bad, and the people who really care about those things have already moved to clang which does the important parts natively without the need for a plugin. I'm personally waiting for the day when we can just say "let's remove them". But in the meantime, making the plugins depend on the gcc version some way is certainly better than not doing so. Linus
On Wed, Mar 03, 2021 at 11:25:34AM -0800, Linus Torvalds wrote: > On Wed, Mar 3, 2021 at 11:15 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > Adding Linus, who indicated in another thread that we shouldn't force > > exact GCC versions because there's no technical reason to do so. > > I do not believe we should recompile everything just because the gcc > version changes. > > But gcc _plugins_ certainly should depend on the kernel version. > > Very few people should be enabling the gcc plugins in the first place. > Honestly, most of them are bad, and the people who really care about > those things have already moved to clang which does the important > parts natively without the need for a plugin. I'm personally waiting > for the day when we can just say "let's remove them". You might be sad to learn that some of the plugins are useful for hardening of a production distro kernel, like stackleak and structleak. > But in the meantime, making the plugins depend on the gcc version some > way is certainly better than not doing so. So currently, the plugins already so that. They require the GCC version to be exact. If there's a mismatch, then it fails the OOT module build. But that's not usable for a distro. When users build OOT modules with a slight GCC mismatch, it breaks the build, effectively requiring the exact same GCC version for *all* OOT builds going forward. So there have been a few proposals to better handle GCC version mismatches: 1) disable the plugin - this works fine for most plugins except randstruct 2) rebuild the plugin whenever the GCC version changes 3) fail the build, like today - effectively blocks distros from using plugins
On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > But in the meantime, making the plugins depend on the gcc version some > > way is certainly better than not doing so. > > So currently, the plugins already so that. They require the GCC version > to be exact. If there's a mismatch, then it fails the OOT module build. That's not my experience. Yes, the build fails, but it fails not by _rebuilding_, but by failing with an error. IOW, it's a dependency problem. That said, I absolutely think that distros that think that stackleak is an important plugin should seriously have a plan to just move to clang, or push the gcc people to just add the "-ftrivial-auto-var-init" thing (and talk sense to the clang people who think that zero isn't good, and want to force a "pattern", but happily they haven't taken over the world yet). The kernel gcc plugins _will_ go away eventually. They are an unmitigated disaster. They always have been. I'm sorry I ever merged that support. It's not only a maintenance nightmare, it's just a horrible thing and interface in the first place. It's literally BAD TECHNOLOGY. Gcc plugins were badly done. They _should_ have been done twenty years ago as a proper IR (and people very much asked for them), but for political reasons the FSF was very much against any kind of intermediate representation that could be hooked into. It's one of the reasons clang has been so successful - having the whole LLVM IR model has made life _so_ much better for anybody working on any kind of compiler that it's not even funny. Gcc plugins were too little, too late, and are not even remotely a good model technically. LLVM did things right with a well-defined IR front and center, and while I dearly love gcc for a lot of reasons, I absolutely despise how badly gcc handled this all - and I despise how that horrible decision was never about technology, and was always due to bad politics on the part of FSF and rms. End result: gcc plugins are pure garbage, and you should shun them. If you really believe you need compiler plugins, you should look at clang. That really is the only sane technical answer. Linus
On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote: > On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > But in the meantime, making the plugins depend on the gcc version some > > > way is certainly better than not doing so. > > > > So currently, the plugins already so that. They require the GCC version > > to be exact. If there's a mismatch, then it fails the OOT module build. > > That's not my experience. > > Yes, the build fails, but it fails not by _rebuilding_, but by failing > with an error. Um, that's what I said. It does not rebuild. It fails with an error. The *proposal* is to rebuild the plugin -- which Masahiro nacked because he claims GCC mismatches aren't supported for OOT builds (plugin or not). Your nack is for a different reason: GCC plugins are second-class citizens. Fair enough...
On Wed, Mar 03, 2021 at 02:24:12PM -0600, Josh Poimboeuf wrote: > On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote: > > On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > > > But in the meantime, making the plugins depend on the gcc version some > > > > way is certainly better than not doing so. > > > > > > So currently, the plugins already so that. They require the GCC version > > > to be exact. If there's a mismatch, then it fails the OOT module build. > > > > That's not my experience. > > > > Yes, the build fails, but it fails not by _rebuilding_, but by failing > > with an error. > > Um, that's what I said. It does not rebuild. It fails with an error. > > The *proposal* is to rebuild the plugin -- which Masahiro nacked because > he claims GCC mismatches aren't supported for OOT builds (plugin or > not). > > Your nack is for a different reason: GCC plugins are second-class > citizens. Fair enough... Or was it a nack? :-) Reading your message again, I may have misinterpreted. Put simply, should we rebuild plugins when the GCC version changes?
On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > Your nack is for a different reason: GCC plugins are second-class > citizens. Fair enough... MNo, I didn't NAK it. Quite the reverser. I am ABSOLUTELY against rebuilding normal object files just because gcc versions change. A compiler version change makes zero difference for any normal object file. But the gcc plugins are different. They very much _are_ tied to a particular gcc version. Now, they are tied to a particular gcc version because they are horribly badly done, and bad technology, and I went off on a bit of a rant about just how bad they are, but the point is that gcc plugins depend on the exact gcc version in ways that normal object files do _not_. Linus
On Wed, Mar 03, 2021 at 12:56:52PM -0800, Linus Torvalds wrote: > On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > Your nack is for a different reason: GCC plugins are second-class > > citizens. Fair enough... > > MNo, I didn't NAK it. Quite the reverser. > > I am ABSOLUTELY against rebuilding normal object files just because > gcc versions change. A compiler version change makes zero difference > for any normal object file. > > But the gcc plugins are different. They very much _are_ tied to a > particular gcc version. > > Now, they are tied to a particular gcc version because they are > horribly badly done, and bad technology, and I went off on a bit of a > rant about just how bad they are, but the point is that gcc plugins > depend on the exact gcc version in ways that normal object files do > _not_. Thanks, reading comprehension is hard. I realized after re-reading that I interpreted your "plugins should depend on the kernel version" statement too broadly. Masahiro, any idea how I can make the GCC version a build dependency?
On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote: > End result: gcc plugins are pure garbage, and you should shun them. If I think that's pretty harsh, but okay, opinions are opinions. As Josh says, people are interested in them for not-uncommon real-world uses: - stackleak has data-lifetime wiping coverage that -ftrivial-auto-var-init=zero for either Clang or GCC doesn't cover. There are no plans to provide such coverage under Clang yet. It's arguable that stackleak's benefits are smaller than it's maintenance burden compared to having -ftrivial-auto-var-init=zero, but we can have that conversation when we get there. :) - structleak is likely to vanish as soon as GCC supports -ftrivial-auto-var-init=zero. Clang's implementation is done and in use by every Clang-built kernel I know of. - latent_entropy is likely less useful since the jitter entropy was added, but I've not seen anyone analyze it. No "upstream" GCC nor Clang support is planned. - arm32 per-task stack protector canary meaningfully reduces the risk of stack content exposures vs stack frame overwrites, but neither GCC nor Clang seem interested in implementing this "correctly" (as done for arm64 on GCC -- Clang doesn't have this for arm64 yet). I want this fixed for arm64 on Clang, and maybe arm32 can be done at the same time. - randstruct is likely not used for distro kernels, but very much for end users where security has a giant priority over performance. There's no "upstream" GCC plan for this, and the Clang support has stalled. > you really believe you need compiler plugins, you should look at > clang. This is currently true only in 1 case (structleak), and only a few "traditional" distros are building the kernel with Clang right now. I don't disagree: doing this via LLVM IR would be much easier, but the implementations for the other above features don't exist yet. I expect this to change over time (I expect Clang's randstruct and GCC's -ftrivial-auto-var-init=zero to likely be the next two things to appear), but it's not the case right now.
On Thu, Mar 4, 2021 at 4:15 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Thu, Mar 04, 2021 at 03:49:35AM +0900, Masahiro Yamada wrote: > > > This problem is becoming more prevalent. We will need to fix it one way > > > or another, if we want to support distro adoption of these GCC > > > plugin-based features. > > > > > > Frank suggested a possibly better idea: always rebuild the plugins when > > > the GCC version changes. > > > > > > That is just another form of the previous patch, > > which was already rejected. > > > > > > - That is a hack just for external modules > > - Our consensus is, use the same version for the kernel and external modules > > > > > > > > I use Ubuntu, and I do not see such a problem. > > (I have never seen it on Debian either, except sid.) > > > > I see Fedora providing GCC whose version is different > > from the one used for building the kernel. > > That is a problem on the distribution side. > > I don't understand. Are you suggesting that a distro should always > release GCC and kernel updates simultaneously? Well, as Greg says, given that GCC is less frequently upgraded than the kernel, this should not be a big deal. https://lore.kernel.org/lkml/YBBML2eB%2FIXcTvcn@kroah.com/ > How is this problem specific to Fedora? Please be specific about what > Ubuntu and Debian do, which Fedora doesn't do. I do not know what they exactly do. Looking at only the result, Ubuntu and Debian do proper engineering, which Fedora doesn't. > > Adding Linus, who indicated in another thread that we shouldn't force > exact GCC versions because there's no technical reason to do so. > > -- > Josh >
On Thu, Mar 4, 2021 at 6:45 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Wed, Mar 03, 2021 at 12:56:52PM -0800, Linus Torvalds wrote: > > On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > > > > > Your nack is for a different reason: GCC plugins are second-class > > > citizens. Fair enough... > > > > MNo, I didn't NAK it. Quite the reverser. > > > > I am ABSOLUTELY against rebuilding normal object files just because > > gcc versions change. A compiler version change makes zero difference > > for any normal object file. > > > > But the gcc plugins are different. They very much _are_ tied to a > > particular gcc version. > > > > Now, they are tied to a particular gcc version because they are > > horribly badly done, and bad technology, and I went off on a bit of a > > rant about just how bad they are, but the point is that gcc plugins > > depend on the exact gcc version in ways that normal object files do > > _not_. > > Thanks, reading comprehension is hard. I realized after re-reading that > I interpreted your "plugins should depend on the kernel version" > statement too broadly. > > Masahiro, any idea how I can make the GCC version a build dependency? I agree with rebuilding GCC plugins when the compiler is upgraded for *in-tree* building. Linus had reported it a couple of months before, and I just submitted a very easy fix. Rebuilding plugins for external modules is not easy; plugins are placed in the read-only directory, /usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/. The external modules must not (cannot) update in-tree build artifacts. "Rebuild" means creating copies in a different writable directory. Doing that requires a lot of design changes.
On Thu, Mar 04, 2021 at 09:27:28PM +0900, Masahiro Yamada wrote: > I agree with rebuilding GCC plugins when the compiler is upgraded > for *in-tree* building. > Linus had reported it a couple of months before, > and I just submitted a very easy fix. Hm? So does that mean that a GCC version change won't trigger a tree-wide rebuild? So you're asserting that a GCC mismatch is ok for in-tree code, but not for external modules??? That seems backwards. For in-tree, why not just rebuild the entire tree? Some kernel features are dependent on compiler version or capability, so not rebuilding the tree could introduce silent breakage. For external modules, a tree-wide rebuild isn't an option so the risk is assumed by the user. I posted a patch earlier [1] which prints a warning if the compiler major/minor version changes with an external module build. [1] https://lkml.kernel.org/r/20210201211322.t2rxmvnrystc2ky7@treble > Rebuilding plugins for external modules is not easy; > plugins are placed in the read-only directory, > /usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/. > > The external modules must not (cannot) update in-tree > build artifacts. "Rebuild" means creating copies in a different > writable directory. > Doing that requires a lot of design changes. Ok. So it sounds like the best/easiest option is the original patch in this thread: when building an external module with a GCC mismatch, just disable the GCC plugin, with a warning (or an error for randstruct).
On Fri, Mar 5, 2021 at 12:08 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Thu, Mar 04, 2021 at 09:27:28PM +0900, Masahiro Yamada wrote: > > I agree with rebuilding GCC plugins when the compiler is upgraded > > for *in-tree* building. > > Linus had reported it a couple of months before, > > and I just submitted a very easy fix. > > Hm? So does that mean that a GCC version change won't trigger a > tree-wide rebuild? So you're asserting that a GCC mismatch is ok for > in-tree code, but not for external modules??? That seems backwards. > > For in-tree, why not just rebuild the entire tree? Some kernel features > are dependent on compiler version or capability, so not rebuilding the > tree could introduce silent breakage. All the kernel-space objects are rebuilt when the compiler is upgraded. (See commit 8b59cd81dc5e724eaea283fa6006985891c7bff4) Linus complaint about GCC plugins not being rebuilt. ^^^^^^^^^^^ https://lore.kernel.org/lkml/CAHk-=wieoN5ttOy7SnsGwZv+Fni3R6m-Ut=oxih6bbZ28G+4dw@mail.gmail.com/ That is easy to fix. I submitted a patch: https://patchwork.kernel.org/project/linux-kbuild/patch/20210304113708.215121-1-masahiroy@kernel.org/ > For external modules, a tree-wide rebuild isn't an option so the risk is > assumed by the user. I posted a patch earlier [1] which prints a > warning if the compiler major/minor version changes with an external > module build. > > [1] https://lkml.kernel.org/r/20210201211322.t2rxmvnrystc2ky7@treble > > > Rebuilding plugins for external modules is not easy; > > plugins are placed in the read-only directory, > > /usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/. > > > > The external modules must not (cannot) update in-tree > > build artifacts. "Rebuild" means creating copies in a different > > writable directory. > > Doing that requires a lot of design changes. > > Ok. So it sounds like the best/easiest option is the original patch in > this thread: when building an external module with a GCC mismatch, just > disable the GCC plugin, with a warning (or an error for randstruct). It was rejected. If a distribution wants to enable CONFIG_GCC_PLUGINS, it must provide GCC whose version is the same as used for building the kernel. If a distribution cannot manage release in that way, do not enable CONFIG_GCC_PLUGINS in the first place.
On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > > All the kernel-space objects are rebuilt > when the compiler is upgraded. I very much NAK'ed that one. Why did that go in? Or maybe I NAK'ed another version of it (I think the one I NAK'ed was from Josh), and didn't realize that there were multiple ones. > Linus complaint about GCC plugins not being rebuilt. Yes, and that was a separate complaint and not at all tied to the other objects. Gcc plugins aren't kernel object files at all. They are very much like dynamically loadable libraries to gcc itself. Linus
On Thu, Mar 04, 2021 at 11:12:42AM -0800, Linus Torvalds wrote: > On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > > > > All the kernel-space objects are rebuilt > > when the compiler is upgraded. > > I very much NAK'ed that one. Why did that go in? > > Or maybe I NAK'ed another version of it (I think the one I NAK'ed was > from Josh), and didn't realize that there were multiple ones. This thread is making me dizzy, but I think the patch you NAK'ed from me was different. It just added an error on GCC mismatch with external modules: https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoimboe@redhat.com though I think you were ok with downgrading it to a warning.
On Thu, Mar 4, 2021 at 6:41 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > This thread is making me dizzy, but I think the patch you NAK'ed from me > was different. It just added an error on GCC mismatch with external > modules: > > https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoimboe@redhat.com > > though I think you were ok with downgrading it to a warning. Right you are. So that was about the external module thing, which is obviously incidentally an issue here with the plugin behavior too, but different. My inbox is kind of a modern-day Colossal Cave adventure: "You are in a maze of twisty email threads, all similar but with different hidden details". Linus
On Fri, Mar 5, 2021 at 4:13 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > > > > All the kernel-space objects are rebuilt > > when the compiler is upgraded. > > I very much NAK'ed that one. Why did that go in? When the compiler is upgraded, all objects should be rebuilt by the new compiler, - this keeps Kbuild deterministic, irrespective of whether it is a fresh build, or incremental build. If we do not force the full rebuild, the banner at boot time is no point. [ 0.000000] Linux version 5.8.0-44-generic (buildd@lgw01-amd64-039) (gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0, GNU ld (GNU Binutils for Ubuntu) 2.35.1) #50-Ubuntu SMP Tue Feb 9 06:29:41 UTC 2021 (Ubuntu 5.8.0-44.50-generic 5.8.18) It claims it was built by gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0 but we would never know if it is true for whole objects. Some of them might have been compiled by an older compiler. > Or maybe I NAK'ed another version of it (I think the one I NAK'ed was > from Josh), and didn't realize that there were multiple ones. > > > Linus complaint about GCC plugins not being rebuilt. > > Yes, and that was a separate complaint and not at all tied to the other objects. > > Gcc plugins aren't kernel object files at all. They are very much like > dynamically loadable libraries to gcc itself. > > Linus
On Fri, Mar 5, 2021 at 11:41 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote: > > On Thu, Mar 04, 2021 at 11:12:42AM -0800, Linus Torvalds wrote: > > On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote: > > > > > > All the kernel-space objects are rebuilt > > > when the compiler is upgraded. > > > > I very much NAK'ed that one. Why did that go in? > > > > Or maybe I NAK'ed another version of it (I think the one I NAK'ed was > > from Josh), and didn't realize that there were multiple ones. > > This thread is making me dizzy, Me too. > Ok. So it sounds like the best/easiest option is the original patch in > this thread: when building an external module with a GCC mismatch, just > disable the GCC plugin, with a warning (or an error for randstruct). Just for clarification, I believe "the original patch" pointed to this one: https://lore.kernel.org/lkml/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com/ This is dead. Please do not come back to this. See negative comments not only from me, but also from Greg, Peter, Christoph. > but I think the patch you NAK'ed from me > was different. It just added an error on GCC mismatch with external > modules: > > https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoimboe@redhat.com > > though I think you were ok with downgrading it to a warning. I think this was not NAKed. At first, Linus NAKed it, but I think it was just because of some misunderstanding. I do not have an objection about checking compiler version difference for external module builds. I was pointing out some mistakes in the Makefile implementation. (I would say "not that way" even with your second trial). -- Best Regards Masahiro Yamada
On Sat, Mar 06, 2021 at 01:03:32AM +0900, Masahiro Yamada wrote: > > Ok. So it sounds like the best/easiest option is the original patch in > > this thread: when building an external module with a GCC mismatch, just > > disable the GCC plugin, with a warning (or an error for randstruct). > > Just for clarification, > I believe "the original patch" pointed to this one: > https://lore.kernel.org/lkml/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com/ > > This is dead. Please do not come back to this. Sorry, no. The patch may have been crap, but that doesn't make the problem I'm trying to solve any less valid. > See negative comments not only from me, but also from Greg, Peter, > Christoph. I responded to those. Summarizing my replies once again... - "External modules aren't supported" This doesn't even remotely match reality. Are you honestly using this "negative comment" as a reason to NAK the patch? - "External modules must be built with the same GCC version" As has been stated repeatedly, by Linus and others, there's no technical reason behind this claim. It ignores the realities of how distros release the kernel and compiler independently, with separate cadences. Minor variances in compiler version are ABI compatible. Also, for features which are dependent on compiler version, many of those are now enabled by kbuild. As I suggested to you previously, kbuild should warn when such features get disabled (which can happen due to a compiler/toolchain change or due to a .config copied from another system).
> - "External modules must be built with the same GCC version" > > As has been stated repeatedly, by Linus and others, there's no > technical reason behind this claim. It ignores the realities of how > distros release the kernel and compiler independently, with separate > cadences. Minor variances in compiler version are ABI compatible. Aren't major versions ABI compatible? Otherwise it is a different architecture. > Also, for features which are dependent on compiler version, many of > those are now enabled by kbuild. As I suggested to you previously, > kbuild should warn when such features get disabled (which can happen > due to a compiler/toolchain change or due to a .config copied from > another system). Anything that is just an optimisation (or pessimisation) really doesn't matter. AFAICT most of these options are in the latter category. One think people might do is download a kernel from kernel.org and then build an extra module for it. The compiler they have will be completely different. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins index 952e46876329..7227692fba59 100644 --- a/scripts/Makefile.gcc-plugins +++ b/scripts/Makefile.gcc-plugins @@ -51,6 +51,25 @@ export DISABLE_ARM_SSP_PER_TASK_PLUGIN GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y)) # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here. GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS)) + +# Out-of-tree module check: If there's a GCC version mismatch, disable plugins +# and print a warning. Otherwise the OOT module build will fail due to +# plugin_default_version_check(). +ifneq ($(GCC_PLUGINS_CFLAGS),) + ifneq ($(KBUILD_EXTMOD),) + ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX))) + + ifdef CONFIG_GCC_PLUGIN_RANDSTRUCT + $(error error: CONFIG_GCC_PLUGIN_RANDSTRUCT requires out-of-tree modules to be built using the same GCC version as the kernel.) + endif + + $(warning warning: Disabling GCC plugins for out-of-tree modules due to GCC version mismatch.) + $(warning warning: The following plugins have been disabled: $(gcc-plugin-y)) + GCC_PLUGINS_CFLAGS := + endif + endif +endif + export GCC_PLUGINS_CFLAGS # Add the flags to the build! diff --git a/scripts/Makefile.kcov b/scripts/Makefile.kcov index 67e8cfe3474b..63a2bc2aabb2 100644 --- a/scripts/Makefile.kcov +++ b/scripts/Makefile.kcov @@ -3,4 +3,15 @@ kcov-flags-$(CONFIG_CC_HAS_SANCOV_TRACE_PC) += -fsanitize-coverage=trace-pc kcov-flags-$(CONFIG_KCOV_ENABLE_COMPARISONS) += -fsanitize-coverage=trace-cmp kcov-flags-$(CONFIG_GCC_PLUGIN_SANCOV) += -fplugin=$(objtree)/scripts/gcc-plugins/sancov_plugin.so +# Out-of-tree module check for GCC version mismatch. +# See the similar check in scripts/Makefile.gcc-plugins +ifneq ($(CONFIG_GCC_PLUGIN_SANCOV),) + ifneq ($(KBUILD_EXTMOD),) + ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX))) + $(warning warning: Disabling CONFIG_GCC_PLUGIN_SANCOV for out-of-tree modules due to GCC version mismatch.) + kcov-flags-y := $(filter-out %/sancov_plugin.so, $(kcov-flags-y)) + endif + endif +endif + export CFLAGS_KCOV := $(kcov-flags-y)
When building out-of-tree kernel modules, the build system doesn't require the GCC version to match the version used to build the original kernel. That's probably [1] fine. In fact, for many distros, the version of GCC used to build the latest kernel doesn't necessarily match the latest released GCC, so a GCC mismatch turns out to be pretty common. And with CONFIG_MODVERSIONS it's probably more common. So a lot of users have come to rely on being able to use a different version of GCC when building OOT modules. But with GCC plugins enabled, that's no longer allowed: cc1: error: incompatible gcc/plugin versions cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so That error comes from the plugin's call to plugin_default_version_check(), which strictly enforces the GCC version. The strict check makes sense, because there's nothing to prevent the GCC plugin ABI from changing -- and it often does. But failing the build isn't necessary. For most plugins, OOT modules will otherwise work just fine without the plugin instrumentation. When a GCC version mismatch is detected, print a warning and disable the plugin. The only exception is the RANDSTRUCT plugin which needs all code to see the same struct layouts. In that case print an error. [1] Ignoring, for the moment, that the kernel now has toolchain-dependent kconfig options, which can silently disable features and cause havoc when compiler versions differ, or even when certain libraries are missing. This is a separate problem which also needs to be addressed. Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> --- scripts/Makefile.gcc-plugins | 19 +++++++++++++++++++ scripts/Makefile.kcov | 11 +++++++++++ 2 files changed, 30 insertions(+)