diff mbox series

[RFC] gcc-plugins: Handle GCC version mismatch for OOT modules

Message ID efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com (mailing list archive)
State New, archived
Headers show
Series [RFC] gcc-plugins: Handle GCC version mismatch for OOT modules | expand

Commit Message

Josh Poimboeuf Jan. 25, 2021, 8:42 p.m. UTC
When building out-of-tree kernel modules, the build system doesn't
require the GCC version to match the version used to build the original
kernel.  That's probably [1] fine.

In fact, for many distros, the version of GCC used to build the latest
kernel doesn't necessarily match the latest released GCC, so a GCC
mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
it's probably more common.

So a lot of users have come to rely on being able to use a different
version of GCC when building OOT modules.

But with GCC plugins enabled, that's no longer allowed:

  cc1: error: incompatible gcc/plugin versions
  cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so

That error comes from the plugin's call to
plugin_default_version_check(), which strictly enforces the GCC version.
The strict check makes sense, because there's nothing to prevent the GCC
plugin ABI from changing -- and it often does.

But failing the build isn't necessary.  For most plugins, OOT modules
will otherwise work just fine without the plugin instrumentation.

When a GCC version mismatch is detected, print a warning and disable the
plugin.  The only exception is the RANDSTRUCT plugin which needs all
code to see the same struct layouts.  In that case print an error.

[1] Ignoring, for the moment, that the kernel now has
    toolchain-dependent kconfig options, which can silently disable
    features and cause havoc when compiler versions differ, or even when
    certain libraries are missing.  This is a separate problem which
    also needs to be addressed.

Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 scripts/Makefile.gcc-plugins | 19 +++++++++++++++++++
 scripts/Makefile.kcov        | 11 +++++++++++
 2 files changed, 30 insertions(+)

Comments

Masahiro Yamada Jan. 25, 2021, 9:16 p.m. UTC | #1
On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> When building out-of-tree kernel modules, the build system doesn't
> require the GCC version to match the version used to build the original
> kernel.  That's probably [1] fine.
>
> In fact, for many distros, the version of GCC used to build the latest
> kernel doesn't necessarily match the latest released GCC, so a GCC
> mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> it's probably more common.
>
> So a lot of users have come to rely on being able to use a different
> version of GCC when building OOT modules.
>
> But with GCC plugins enabled, that's no longer allowed:
>
>   cc1: error: incompatible gcc/plugin versions
>   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
>
> That error comes from the plugin's call to
> plugin_default_version_check(), which strictly enforces the GCC version.
> The strict check makes sense, because there's nothing to prevent the GCC
> plugin ABI from changing -- and it often does.
>
> But failing the build isn't necessary.  For most plugins, OOT modules
> will otherwise work just fine without the plugin instrumentation.
>
> When a GCC version mismatch is detected, print a warning and disable the
> plugin.  The only exception is the RANDSTRUCT plugin which needs all
> code to see the same struct layouts.  In that case print an error.
>
> [1] Ignoring, for the moment, that the kernel now has
>     toolchain-dependent kconfig options, which can silently disable
>     features and cause havoc when compiler versions differ, or even when
>     certain libraries are missing.  This is a separate problem which
>     also needs to be addressed.
>
> Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---


We are based on the assumption that we use the same
compiler for in-tree and out-of-tree.

If people use a different compiler, they must be
prepared for any possible problem.


Using different compiler flags for in-tree and out-of-tree
is even more dangerous.

For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
for in-tree build, and then disabled for out-of-tree modules,
the struct layout will mismatch, won't it?


This patch is ugly, and not doing the right thing.




>  scripts/Makefile.gcc-plugins | 19 +++++++++++++++++++
>  scripts/Makefile.kcov        | 11 +++++++++++
>  2 files changed, 30 insertions(+)
>
> diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
> index 952e46876329..7227692fba59 100644
> --- a/scripts/Makefile.gcc-plugins
> +++ b/scripts/Makefile.gcc-plugins
> @@ -51,6 +51,25 @@ export DISABLE_ARM_SSP_PER_TASK_PLUGIN
>  GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
>  # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here.
>  GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS))
> +
> +# Out-of-tree module check: If there's a GCC version mismatch, disable plugins
> +# and print a warning.  Otherwise the OOT module build will fail due to
> +# plugin_default_version_check().
> +ifneq ($(GCC_PLUGINS_CFLAGS),)
> +    ifneq ($(KBUILD_EXTMOD),)
> +        ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX)))
> +
> +            ifdef CONFIG_GCC_PLUGIN_RANDSTRUCT
> +                $(error error: CONFIG_GCC_PLUGIN_RANDSTRUCT requires out-of-tree modules to be built using the same GCC version as the kernel.)
> +            endif
> +
> +            $(warning warning: Disabling GCC plugins for out-of-tree modules due to GCC version mismatch.)
> +            $(warning warning: The following plugins have been disabled: $(gcc-plugin-y))
> +            GCC_PLUGINS_CFLAGS :=
> +       endif
> +    endif
> +endif
> +
>  export GCC_PLUGINS_CFLAGS
>
>  # Add the flags to the build!
> diff --git a/scripts/Makefile.kcov b/scripts/Makefile.kcov
> index 67e8cfe3474b..63a2bc2aabb2 100644
> --- a/scripts/Makefile.kcov
> +++ b/scripts/Makefile.kcov
> @@ -3,4 +3,15 @@ kcov-flags-$(CONFIG_CC_HAS_SANCOV_TRACE_PC)    += -fsanitize-coverage=trace-pc
>  kcov-flags-$(CONFIG_KCOV_ENABLE_COMPARISONS)   += -fsanitize-coverage=trace-cmp
>  kcov-flags-$(CONFIG_GCC_PLUGIN_SANCOV)         += -fplugin=$(objtree)/scripts/gcc-plugins/sancov_plugin.so
>
> +# Out-of-tree module check for GCC version mismatch.
> +# See the similar check in scripts/Makefile.gcc-plugins
> +ifneq ($(CONFIG_GCC_PLUGIN_SANCOV),)
> +    ifneq ($(KBUILD_EXTMOD),)
> +        ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX)))
> +            $(warning warning: Disabling CONFIG_GCC_PLUGIN_SANCOV for out-of-tree modules due to GCC version mismatch.)
> +            kcov-flags-y := $(filter-out %/sancov_plugin.so, $(kcov-flags-y))
> +        endif
> +    endif
> +endif
> +
>  export CFLAGS_KCOV := $(kcov-flags-y)
> --
> 2.29.2
>
Josh Poimboeuf Jan. 25, 2021, 9:27 p.m. UTC | #2
On Tue, Jan 26, 2021 at 06:16:01AM +0900, Masahiro Yamada wrote:
> On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > When building out-of-tree kernel modules, the build system doesn't
> > require the GCC version to match the version used to build the original
> > kernel.  That's probably [1] fine.
> >
> > In fact, for many distros, the version of GCC used to build the latest
> > kernel doesn't necessarily match the latest released GCC, so a GCC
> > mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> > it's probably more common.
> >
> > So a lot of users have come to rely on being able to use a different
> > version of GCC when building OOT modules.
> >
> > But with GCC plugins enabled, that's no longer allowed:
> >
> >   cc1: error: incompatible gcc/plugin versions
> >   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
> >
> > That error comes from the plugin's call to
> > plugin_default_version_check(), which strictly enforces the GCC version.
> > The strict check makes sense, because there's nothing to prevent the GCC
> > plugin ABI from changing -- and it often does.
> >
> > But failing the build isn't necessary.  For most plugins, OOT modules
> > will otherwise work just fine without the plugin instrumentation.
> >
> > When a GCC version mismatch is detected, print a warning and disable the
> > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > code to see the same struct layouts.  In that case print an error.
> >
> > [1] Ignoring, for the moment, that the kernel now has
> >     toolchain-dependent kconfig options, which can silently disable
> >     features and cause havoc when compiler versions differ, or even when
> >     certain libraries are missing.  This is a separate problem which
> >     also needs to be addressed.
> >
> > Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
> > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > ---
> 
> 
> We are based on the assumption that we use the same
> compiler for in-tree and out-of-tree.

Sorry, but that assumption isn't based in reality.  And it's not
enforced.

> If people use a different compiler, they must be
> prepared for any possible problem.
>
> Using different compiler flags for in-tree and out-of-tree
> is even more dangerous.
> 
> For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> for in-tree build, and then disabled for out-of-tree modules,
> the struct layout will mismatch, won't it?

If you read the patch you'll notice that it handles that case, when it's
caused by GCC mismatch.

However, as alluded to in the [1] footnote, it doesn't handle the case
where the OOT build system doesn't have gcc-plugin-devel installed.
Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
succeeds!  That happens even without a GCC mismatch.

> This patch is ugly, and not doing the right thing.

Maybe, but I doubt the solution is to make assumptions.
Masahiro Yamada Jan. 25, 2021, 9:44 p.m. UTC | #3
On Tue, Jan 26, 2021 at 6:28 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Tue, Jan 26, 2021 at 06:16:01AM +0900, Masahiro Yamada wrote:
> > On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > When building out-of-tree kernel modules, the build system doesn't
> > > require the GCC version to match the version used to build the original
> > > kernel.  That's probably [1] fine.
> > >
> > > In fact, for many distros, the version of GCC used to build the latest
> > > kernel doesn't necessarily match the latest released GCC, so a GCC
> > > mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> > > it's probably more common.
> > >
> > > So a lot of users have come to rely on being able to use a different
> > > version of GCC when building OOT modules.
> > >
> > > But with GCC plugins enabled, that's no longer allowed:
> > >
> > >   cc1: error: incompatible gcc/plugin versions
> > >   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
> > >
> > > That error comes from the plugin's call to
> > > plugin_default_version_check(), which strictly enforces the GCC version.
> > > The strict check makes sense, because there's nothing to prevent the GCC
> > > plugin ABI from changing -- and it often does.
> > >
> > > But failing the build isn't necessary.  For most plugins, OOT modules
> > > will otherwise work just fine without the plugin instrumentation.
> > >
> > > When a GCC version mismatch is detected, print a warning and disable the
> > > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > > code to see the same struct layouts.  In that case print an error.
> > >
> > > [1] Ignoring, for the moment, that the kernel now has
> > >     toolchain-dependent kconfig options, which can silently disable
> > >     features and cause havoc when compiler versions differ, or even when
> > >     certain libraries are missing.  This is a separate problem which
> > >     also needs to be addressed.
> > >
> > > Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > ---
> >
> >
> > We are based on the assumption that we use the same
> > compiler for in-tree and out-of-tree.
>
> Sorry, but that assumption isn't based in reality.  And it's not
> enforced.
>
> > If people use a different compiler, they must be
> > prepared for any possible problem.
> >
> > Using different compiler flags for in-tree and out-of-tree
> > is even more dangerous.
> >
> > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> > for in-tree build, and then disabled for out-of-tree modules,
> > the struct layout will mismatch, won't it?
>
> If you read the patch you'll notice that it handles that case, when it's
> caused by GCC mismatch.
>
> However, as alluded to in the [1] footnote, it doesn't handle the case
> where the OOT build system doesn't have gcc-plugin-devel installed.
> Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
> succeeds!  That happens even without a GCC mismatch.


Ah, sorry.

I responded too early before reading the patch fully.

But, I do not like to make RANDSTRUCT a special case.

I'd rather want to stop building for any plugin.







> > This patch is ugly, and not doing the right thing.
>
> Maybe, but I doubt the solution is to make assumptions.
>
> --
> Josh
>
Kees Cook Jan. 25, 2021, 10:03 p.m. UTC | #4
On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> When a GCC version mismatch is detected, print a warning and disable the
> plugin.  The only exception is the RANDSTRUCT plugin which needs all
> code to see the same struct layouts.  In that case print an error.

I prefer this patch as-is: only randstruct needs a hard failure. The
others likely work (in fact, randstruct likely works too).

Masahiro, are you suggesting to be a hard-failure for all plugins?
Josh Poimboeuf Jan. 25, 2021, 10:07 p.m. UTC | #5
On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote:
> > > If people use a different compiler, they must be
> > > prepared for any possible problem.
> > >
> > > Using different compiler flags for in-tree and out-of-tree
> > > is even more dangerous.
> > >
> > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> > > for in-tree build, and then disabled for out-of-tree modules,
> > > the struct layout will mismatch, won't it?
> >
> > If you read the patch you'll notice that it handles that case, when it's
> > caused by GCC mismatch.
> >
> > However, as alluded to in the [1] footnote, it doesn't handle the case
> > where the OOT build system doesn't have gcc-plugin-devel installed.
> > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
> > succeeds!  That happens even without a GCC mismatch.
> 
> 
> Ah, sorry.
> 
> I responded too early before reading the patch fully.
> 
> But, I do not like to make RANDSTRUCT a special case.
> 
> I'd rather want to stop building for any plugin.

Other than RANDSTRUCT there doesn't seem to be any problem with
disabling them (and printing a warning) in the OOT build.  Why not give
users that option?  It's harmless, and will make distro's (and their
users') lives easier.

Either GCC mismatch is ok, or it's not.  Let's not half-enforce it.

Also, how do you propose we solve the other problem, where a missing
optional library (gcc-plugin-devel) causes the OOT module build to
silently disable the plugin?  This is related to my earlier complaint
about the dangers of toolchain-dependent kconfig options.
Josh Poimboeuf Jan. 25, 2021, 10:19 p.m. UTC | #6
On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote:
> On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> > When a GCC version mismatch is detected, print a warning and disable the
> > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > code to see the same struct layouts.  In that case print an error.
> 
> I prefer this patch as-is: only randstruct needs a hard failure. The
> others likely work (in fact, randstruct likely works too).

I'm curious about this last statement, why would randstruct likely work?

Even struct module has '__randomize_layout', wouldn't basic module init
go splat?
Masahiro Yamada Jan. 26, 2021, 1:53 a.m. UTC | #7
On Tue, Jan 26, 2021 at 7:03 AM Kees Cook <keescook@chromium.org> wrote:
>
> On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> > When a GCC version mismatch is detected, print a warning and disable the
> > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > code to see the same struct layouts.  In that case print an error.
>
> I prefer this patch as-is: only randstruct needs a hard failure. The
> others likely work (in fact, randstruct likely works too).
>
> Masahiro, are you suggesting to be a hard-failure for all plugins?

Yes.

I want to require
"I swear to use the same compiler version for external modules"
when you enable GCC plugins.




config CC_VERSION_CHECK_FOR_EXTERNAL_MODULES
        bool "Check the compiler version before building external modules"
        help
           If this option is enabled, the compiler version is checked
           before building external modules. This ensures the same
           compiler is used for the kernel and external modules.


config GCC_PLUGINS
        ...
        depends on CC_VERSION_CHECK_FOR_EXTERNAL_MODULES



In Makefile, check the version for out-of-tree modules
if CONFIG_CC_VERSION_CHECK_FOR_EXTERNAL_MODULES.




There is no difference in the fact that
you cannot use a different compiler for external modules
if CONFIG_GCC_PLUGINS=y.




We started with the assumption that modules must be compiled
by the same compiler as the kernel was.
https://lore.kernel.org/patchwork/patch/836247/#1031547

Now that the compiler capability is evaluated in Kconfig,
this is a harder requirement.

In reality, a different compiler might be used,
and, this requirement might be loosened, but
the same compiler should be required for CONFIG_GCC_PLUGINS.
Greg Kroah-Hartman Jan. 26, 2021, 8:12 a.m. UTC | #8
On Mon, Jan 25, 2021 at 03:27:55PM -0600, Josh Poimboeuf wrote:
> On Tue, Jan 26, 2021 at 06:16:01AM +0900, Masahiro Yamada wrote:
> > On Tue, Jan 26, 2021 at 5:42 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > When building out-of-tree kernel modules, the build system doesn't
> > > require the GCC version to match the version used to build the original
> > > kernel.  That's probably [1] fine.
> > >
> > > In fact, for many distros, the version of GCC used to build the latest
> > > kernel doesn't necessarily match the latest released GCC, so a GCC
> > > mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> > > it's probably more common.
> > >
> > > So a lot of users have come to rely on being able to use a different
> > > version of GCC when building OOT modules.
> > >
> > > But with GCC plugins enabled, that's no longer allowed:
> > >
> > >   cc1: error: incompatible gcc/plugin versions
> > >   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
> > >
> > > That error comes from the plugin's call to
> > > plugin_default_version_check(), which strictly enforces the GCC version.
> > > The strict check makes sense, because there's nothing to prevent the GCC
> > > plugin ABI from changing -- and it often does.
> > >
> > > But failing the build isn't necessary.  For most plugins, OOT modules
> > > will otherwise work just fine without the plugin instrumentation.
> > >
> > > When a GCC version mismatch is detected, print a warning and disable the
> > > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > > code to see the same struct layouts.  In that case print an error.
> > >
> > > [1] Ignoring, for the moment, that the kernel now has
> > >     toolchain-dependent kconfig options, which can silently disable
> > >     features and cause havoc when compiler versions differ, or even when
> > >     certain libraries are missing.  This is a separate problem which
> > >     also needs to be addressed.
> > >
> > > Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
> > > Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> > > ---
> > 
> > 
> > We are based on the assumption that we use the same
> > compiler for in-tree and out-of-tree.
> 
> Sorry, but that assumption isn't based in reality.  And it's not
> enforced.

It's "enforced" in that if something breaks because of this, no one will
support it :)

We have always said, "all kernel code must be built with the exact same
compiler and with the same build options".  Anyone who does anything
different, is on their own.  So please, let's not change things to make
it as of this might work to hide real problems that are known to show up
when people mix/match compilers with modules.

thanks,

greg k-h
Greg KH Jan. 26, 2021, 8:13 a.m. UTC | #9
On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote:
> On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote:
> > > > If people use a different compiler, they must be
> > > > prepared for any possible problem.
> > > >
> > > > Using different compiler flags for in-tree and out-of-tree
> > > > is even more dangerous.
> > > >
> > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> > > > for in-tree build, and then disabled for out-of-tree modules,
> > > > the struct layout will mismatch, won't it?
> > >
> > > If you read the patch you'll notice that it handles that case, when it's
> > > caused by GCC mismatch.
> > >
> > > However, as alluded to in the [1] footnote, it doesn't handle the case
> > > where the OOT build system doesn't have gcc-plugin-devel installed.
> > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
> > > succeeds!  That happens even without a GCC mismatch.
> > 
> > 
> > Ah, sorry.
> > 
> > I responded too early before reading the patch fully.
> > 
> > But, I do not like to make RANDSTRUCT a special case.
> > 
> > I'd rather want to stop building for any plugin.
> 
> Other than RANDSTRUCT there doesn't seem to be any problem with
> disabling them (and printing a warning) in the OOT build.  Why not give
> users that option?  It's harmless, and will make distro's (and their
> users') lives easier.
> 
> Either GCC mismatch is ok, or it's not.  Let's not half-enforce it.

As I said earlier, it's not ok, we can not support it at all.

thanks,

greg k-h
David Laight Jan. 26, 2021, 12:16 p.m. UTC | #10
> We started with the assumption that modules must be compiled
> by the same compiler as the kernel was.
> https://lore.kernel.org/patchwork/patch/836247/#1031547
> 
> Now that the compiler capability is evaluated in Kconfig,
> this is a harder requirement.
> 
> In reality, a different compiler might be used,
> and, this requirement might be loosened, but
> the same compiler should be required for CONFIG_GCC_PLUGINS.

Hmmmm.

If I'm building a module to load into a distro kernel
it is very unlikely that I'll actually have the same version
of the compiler installed as that used to build the kernel.

Additionally some of the .o files (that don't refer to any
kernel headers) may have been compiled with an entirely
different compiler altogether.

Whether any plugins are actually installed is another problem.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Justin Forbes Jan. 26, 2021, 12:44 p.m. UTC | #11
On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote:
>
> On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote:
> > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote:
> > > > > If people use a different compiler, they must be
> > > > > prepared for any possible problem.
> > > > >
> > > > > Using different compiler flags for in-tree and out-of-tree
> > > > > is even more dangerous.
> > > > >
> > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> > > > > for in-tree build, and then disabled for out-of-tree modules,
> > > > > the struct layout will mismatch, won't it?
> > > >
> > > > If you read the patch you'll notice that it handles that case, when it's
> > > > caused by GCC mismatch.
> > > >
> > > > However, as alluded to in the [1] footnote, it doesn't handle the case
> > > > where the OOT build system doesn't have gcc-plugin-devel installed.
> > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
> > > > succeeds!  That happens even without a GCC mismatch.
> > >
> > >
> > > Ah, sorry.
> > >
> > > I responded too early before reading the patch fully.
> > >
> > > But, I do not like to make RANDSTRUCT a special case.
> > >
> > > I'd rather want to stop building for any plugin.
> >
> > Other than RANDSTRUCT there doesn't seem to be any problem with
> > disabling them (and printing a warning) in the OOT build.  Why not give
> > users that option?  It's harmless, and will make distro's (and their
> > users') lives easier.
> >
> > Either GCC mismatch is ok, or it's not.  Let's not half-enforce it.
>
> As I said earlier, it's not ok, we can not support it at all.
>

Support and enforce are 2 completely different things.  To shed a bit
more light on this, the real issue that prompted this was breaking CI
systems.  As we enabled gcc plugins in Fedora, and the toolchain folks
went through 3 different snapshots of gcc 11 in a week. Any CI process
that built an out of tree module failed. I don't think this is nearly
as much of a concern for stable distros, as it is for CI in
development cycles.

Justin
Greg KH Jan. 26, 2021, 1:51 p.m. UTC | #12
On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote:
> On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote:
> >
> > On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote:
> > > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote:
> > > > > > If people use a different compiler, they must be
> > > > > > prepared for any possible problem.
> > > > > >
> > > > > > Using different compiler flags for in-tree and out-of-tree
> > > > > > is even more dangerous.
> > > > > >
> > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> > > > > > for in-tree build, and then disabled for out-of-tree modules,
> > > > > > the struct layout will mismatch, won't it?
> > > > >
> > > > > If you read the patch you'll notice that it handles that case, when it's
> > > > > caused by GCC mismatch.
> > > > >
> > > > > However, as alluded to in the [1] footnote, it doesn't handle the case
> > > > > where the OOT build system doesn't have gcc-plugin-devel installed.
> > > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
> > > > > succeeds!  That happens even without a GCC mismatch.
> > > >
> > > >
> > > > Ah, sorry.
> > > >
> > > > I responded too early before reading the patch fully.
> > > >
> > > > But, I do not like to make RANDSTRUCT a special case.
> > > >
> > > > I'd rather want to stop building for any plugin.
> > >
> > > Other than RANDSTRUCT there doesn't seem to be any problem with
> > > disabling them (and printing a warning) in the OOT build.  Why not give
> > > users that option?  It's harmless, and will make distro's (and their
> > > users') lives easier.
> > >
> > > Either GCC mismatch is ok, or it's not.  Let's not half-enforce it.
> >
> > As I said earlier, it's not ok, we can not support it at all.
> >
> 
> Support and enforce are 2 completely different things.  To shed a bit
> more light on this, the real issue that prompted this was breaking CI
> systems.  As we enabled gcc plugins in Fedora, and the toolchain folks
> went through 3 different snapshots of gcc 11 in a week. Any CI process
> that built an out of tree module failed. I don't think this is nearly
> as much of a concern for stable distros, as it is for CI in
> development cycles.

It's better to have an obvious break like this than to silently accept
things and then have a much harder issue to debug at runtime, right?

Don't allow things that we know we will not support, this sounds like an
issue with your CI systems, not with the kernel build system, why not
just fix that?  :)

thnaks,

greg k-h
Josh Poimboeuf Jan. 26, 2021, 2:51 p.m. UTC | #13
On Tue, Jan 26, 2021 at 02:51:29PM +0100, Greg KH wrote:
> On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote:
> > On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote:
> > >
> > > On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote:
> > > > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote:
> > > > > > > If people use a different compiler, they must be
> > > > > > > prepared for any possible problem.
> > > > > > >
> > > > > > > Using different compiler flags for in-tree and out-of-tree
> > > > > > > is even more dangerous.
> > > > > > >
> > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> > > > > > > for in-tree build, and then disabled for out-of-tree modules,
> > > > > > > the struct layout will mismatch, won't it?
> > > > > >
> > > > > > If you read the patch you'll notice that it handles that case, when it's
> > > > > > caused by GCC mismatch.
> > > > > >
> > > > > > However, as alluded to in the [1] footnote, it doesn't handle the case
> > > > > > where the OOT build system doesn't have gcc-plugin-devel installed.
> > > > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
> > > > > > succeeds!  That happens even without a GCC mismatch.
> > > > >
> > > > >
> > > > > Ah, sorry.
> > > > >
> > > > > I responded too early before reading the patch fully.
> > > > >
> > > > > But, I do not like to make RANDSTRUCT a special case.
> > > > >
> > > > > I'd rather want to stop building for any plugin.
> > > >
> > > > Other than RANDSTRUCT there doesn't seem to be any problem with
> > > > disabling them (and printing a warning) in the OOT build.  Why not give
> > > > users that option?  It's harmless, and will make distro's (and their
> > > > users') lives easier.
> > > >
> > > > Either GCC mismatch is ok, or it's not.  Let's not half-enforce it.
> > >
> > > As I said earlier, it's not ok, we can not support it at all.
> > >
> > 
> > Support and enforce are 2 completely different things.  To shed a bit
> > more light on this, the real issue that prompted this was breaking CI
> > systems.  As we enabled gcc plugins in Fedora, and the toolchain folks
> > went through 3 different snapshots of gcc 11 in a week. Any CI process
> > that built an out of tree module failed. I don't think this is nearly
> > as much of a concern for stable distros, as it is for CI in
> > development cycles.
> 
> It's better to have an obvious break like this than to silently accept
> things and then have a much harder issue to debug at runtime, right?

User space mixes compiler versions all the time.  The C ABI is stable.

What specifically is the harder issue you're referring to?
Greg KH Jan. 26, 2021, 3 p.m. UTC | #14
On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> On Tue, Jan 26, 2021 at 02:51:29PM +0100, Greg KH wrote:
> > On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote:
> > > On Tue, Jan 26, 2021 at 2:21 AM Greg KH <greg@kroah.com> wrote:
> > > >
> > > > On Mon, Jan 25, 2021 at 04:07:57PM -0600, Josh Poimboeuf wrote:
> > > > > On Tue, Jan 26, 2021 at 06:44:35AM +0900, Masahiro Yamada wrote:
> > > > > > > > If people use a different compiler, they must be
> > > > > > > > prepared for any possible problem.
> > > > > > > >
> > > > > > > > Using different compiler flags for in-tree and out-of-tree
> > > > > > > > is even more dangerous.
> > > > > > > >
> > > > > > > > For example, CONFIG_GCC_PLUGIN_RANDSTRUCT is enabled
> > > > > > > > for in-tree build, and then disabled for out-of-tree modules,
> > > > > > > > the struct layout will mismatch, won't it?
> > > > > > >
> > > > > > > If you read the patch you'll notice that it handles that case, when it's
> > > > > > > caused by GCC mismatch.
> > > > > > >
> > > > > > > However, as alluded to in the [1] footnote, it doesn't handle the case
> > > > > > > where the OOT build system doesn't have gcc-plugin-devel installed.
> > > > > > > Then CONFIG_GCC_PLUGIN_RANDSTRUCT gets silently disabled and the build
> > > > > > > succeeds!  That happens even without a GCC mismatch.
> > > > > >
> > > > > >
> > > > > > Ah, sorry.
> > > > > >
> > > > > > I responded too early before reading the patch fully.
> > > > > >
> > > > > > But, I do not like to make RANDSTRUCT a special case.
> > > > > >
> > > > > > I'd rather want to stop building for any plugin.
> > > > >
> > > > > Other than RANDSTRUCT there doesn't seem to be any problem with
> > > > > disabling them (and printing a warning) in the OOT build.  Why not give
> > > > > users that option?  It's harmless, and will make distro's (and their
> > > > > users') lives easier.
> > > > >
> > > > > Either GCC mismatch is ok, or it's not.  Let's not half-enforce it.
> > > >
> > > > As I said earlier, it's not ok, we can not support it at all.
> > > >
> > > 
> > > Support and enforce are 2 completely different things.  To shed a bit
> > > more light on this, the real issue that prompted this was breaking CI
> > > systems.  As we enabled gcc plugins in Fedora, and the toolchain folks
> > > went through 3 different snapshots of gcc 11 in a week. Any CI process
> > > that built an out of tree module failed. I don't think this is nearly
> > > as much of a concern for stable distros, as it is for CI in
> > > development cycles.
> > 
> > It's better to have an obvious break like this than to silently accept
> > things and then have a much harder issue to debug at runtime, right?
> 
> User space mixes compiler versions all the time.  The C ABI is stable.
> 
> What specifically is the harder issue you're referring to?

Have you not noticed include/linux/compiler.h and all of the different
changes/workarounds we do for different versions of gcc/clang/intel
compilers?  We have never guaranteed that a kernel module would work
that was built with a different compiler than the main kernel, and I
doubt we can start now.

thanks,

greg k-h
Peter Zijlstra Jan. 26, 2021, 3:15 p.m. UTC | #15
On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> User space mixes compiler versions all the time.  The C ABI is stable.
> 
> What specifically is the harder issue you're referring to?

I don't think the C ABI captures nearly enough. Imagine trying to mix a
compiler with and without asm-goto support (ok, we fail to build without
by now, but just imagine).

No C ABI violated, but having that GCC extention vs not having it
radically changes the kernel ABI.

I think I'm with Greg here, just don't do it.
Josh Poimboeuf Jan. 26, 2021, 3:46 p.m. UTC | #16
On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> > User space mixes compiler versions all the time.  The C ABI is stable.
> > 
> > What specifically is the harder issue you're referring to?
> 
> I don't think the C ABI captures nearly enough. Imagine trying to mix a
> compiler with and without asm-goto support (ok, we fail to build without
> by now, but just imagine).
> 
> No C ABI violated, but having that GCC extention vs not having it
> radically changes the kernel ABI.
> 
> I think I'm with Greg here, just don't do it.

Ok, thank you for an actual example.  asm goto is a good one.

But it's not a cut-and-dry issue.  Otherwise how could modversions
possibly work?

So yes, we should enforce GCC versions, but I still haven't seen a
reason it should be more than just "same compiler and *major* version".
Peter Zijlstra Jan. 26, 2021, 4:05 p.m. UTC | #17
On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote:
> On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> > > User space mixes compiler versions all the time.  The C ABI is stable.
> > > 
> > > What specifically is the harder issue you're referring to?
> > 
> > I don't think the C ABI captures nearly enough. Imagine trying to mix a
> > compiler with and without asm-goto support (ok, we fail to build without
> > by now, but just imagine).
> > 
> > No C ABI violated, but having that GCC extention vs not having it
> > radically changes the kernel ABI.
> > 
> > I think I'm with Greg here, just don't do it.
> 
> Ok, thank you for an actual example.  asm goto is a good one.
> 
> But it's not a cut-and-dry issue.  Otherwise how could modversions
> possibly work?
> 
> So yes, we should enforce GCC versions, but I still haven't seen a
> reason it should be more than just "same compiler and *major* version".

Why bother? rebuilding the kernel and all modules is a matter of 10
minutes at most on a decently beefy build box.

What actual problem are we trying to solve here?
Justin Forbes Jan. 26, 2021, 4:15 p.m. UTC | #18
On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote:
> > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote:
> > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> > > > User space mixes compiler versions all the time.  The C ABI is stable.
> > > >
> > > > What specifically is the harder issue you're referring to?
> > >
> > > I don't think the C ABI captures nearly enough. Imagine trying to mix a
> > > compiler with and without asm-goto support (ok, we fail to build without
> > > by now, but just imagine).
> > >
> > > No C ABI violated, but having that GCC extention vs not having it
> > > radically changes the kernel ABI.
> > >
> > > I think I'm with Greg here, just don't do it.
> >
> > Ok, thank you for an actual example.  asm goto is a good one.
> >
> > But it's not a cut-and-dry issue.  Otherwise how could modversions
> > possibly work?
> >
> > So yes, we should enforce GCC versions, but I still haven't seen a
> > reason it should be more than just "same compiler and *major* version".
>
> Why bother? rebuilding the kernel and all modules is a matter of 10
> minutes at most on a decently beefy build box.
>
> What actual problem are we trying to solve here?



This is true for those of us used to working with source and building
by hand. For users who want everything packaged, rebuilding a kernel
package for install is considerably longer, and for distros providing
builds for multiple arches, we are looking at a couple of hours at
best.  From a Fedora standpoint, I am perfectly fine with it failing
if someone tries to build a module on gcc10 when the kernel was built
with gcc11.  It's tedious when the kernel was built with gcc11
yesterday, and a new gcc11 build today means that kernel needs to be
rebuilt.
Josh Poimboeuf Jan. 26, 2021, 4:19 p.m. UTC | #19
On Tue, Jan 26, 2021 at 10:15:52AM -0600, Justin Forbes wrote:
> On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote:
> > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote:
> > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> > > > > User space mixes compiler versions all the time.  The C ABI is stable.
> > > > >
> > > > > What specifically is the harder issue you're referring to?
> > > >
> > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a
> > > > compiler with and without asm-goto support (ok, we fail to build without
> > > > by now, but just imagine).
> > > >
> > > > No C ABI violated, but having that GCC extention vs not having it
> > > > radically changes the kernel ABI.
> > > >
> > > > I think I'm with Greg here, just don't do it.
> > >
> > > Ok, thank you for an actual example.  asm goto is a good one.
> > >
> > > But it's not a cut-and-dry issue.  Otherwise how could modversions
> > > possibly work?
> > >
> > > So yes, we should enforce GCC versions, but I still haven't seen a
> > > reason it should be more than just "same compiler and *major* version".
> >
> > Why bother? rebuilding the kernel and all modules is a matter of 10
> > minutes at most on a decently beefy build box.
> >
> > What actual problem are we trying to solve here?
> 
> This is true for those of us used to working with source and building
> by hand. For users who want everything packaged, rebuilding a kernel
> package for install is considerably longer, and for distros providing
> builds for multiple arches, we are looking at a couple of hours at
> best.  From a Fedora standpoint, I am perfectly fine with it failing
> if someone tries to build a module on gcc10 when the kernel was built
> with gcc11.  It's tedious when the kernel was built with gcc11
> yesterday, and a new gcc11 build today means that kernel needs to be
> rebuilt.

Right.  It's a problem for distro users.  The compiler and kernel are in
separate packages, with separate release cadences.  The latest compiler
version may not exactly match what was used to build the latest kernel.
David Laight Jan. 26, 2021, 4:22 p.m. UTC | #20
From: Peter Zijlstra <peterz@infradead.org>
> Sent: 26 January 2021 16:05
> 
> On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote:
> > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote:
> > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> > > > User space mixes compiler versions all the time.  The C ABI is stable.
> > > >
> > > > What specifically is the harder issue you're referring to?
> > >
> > > I don't think the C ABI captures nearly enough. Imagine trying to mix a
> > > compiler with and without asm-goto support (ok, we fail to build without
> > > by now, but just imagine).
> > >
> > > No C ABI violated, but having that GCC extention vs not having it
> > > radically changes the kernel ABI.
> > >
> > > I think I'm with Greg here, just don't do it.
> >
> > Ok, thank you for an actual example.  asm goto is a good one.
> >
> > But it's not a cut-and-dry issue.  Otherwise how could modversions
> > possibly work?
> >
> > So yes, we should enforce GCC versions, but I still haven't seen a
> > reason it should be more than just "same compiler and *major* version".
> 
> Why bother? rebuilding the kernel and all modules is a matter of 10
> minutes at most on a decently beefy build box.
> 
> What actual problem are we trying to solve here?

People build modules to load into disrto-provided kernels.

I'd have though the compiler would need to support the same options
as that used to build the kernel - but not necessarily be exactly
the same version.

Ignoring compiler bugs (which will bight you if the compiler you
have is broken) then a newer compiler ought to be fine.
Or a kernel might have been built with config options that don't
require features of the compiler being used - so that modules can
be built with an older compiler.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Greg KH Jan. 26, 2021, 5:06 p.m. UTC | #21
On Tue, Jan 26, 2021 at 10:19:34AM -0600, Josh Poimboeuf wrote:
> On Tue, Jan 26, 2021 at 10:15:52AM -0600, Justin Forbes wrote:
> > On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote:
> > > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote:
> > > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> > > > > > User space mixes compiler versions all the time.  The C ABI is stable.
> > > > > >
> > > > > > What specifically is the harder issue you're referring to?
> > > > >
> > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a
> > > > > compiler with and without asm-goto support (ok, we fail to build without
> > > > > by now, but just imagine).
> > > > >
> > > > > No C ABI violated, but having that GCC extention vs not having it
> > > > > radically changes the kernel ABI.
> > > > >
> > > > > I think I'm with Greg here, just don't do it.
> > > >
> > > > Ok, thank you for an actual example.  asm goto is a good one.
> > > >
> > > > But it's not a cut-and-dry issue.  Otherwise how could modversions
> > > > possibly work?
> > > >
> > > > So yes, we should enforce GCC versions, but I still haven't seen a
> > > > reason it should be more than just "same compiler and *major* version".
> > >
> > > Why bother? rebuilding the kernel and all modules is a matter of 10
> > > minutes at most on a decently beefy build box.
> > >
> > > What actual problem are we trying to solve here?
> > 
> > This is true for those of us used to working with source and building
> > by hand. For users who want everything packaged, rebuilding a kernel
> > package for install is considerably longer, and for distros providing
> > builds for multiple arches, we are looking at a couple of hours at
> > best.  From a Fedora standpoint, I am perfectly fine with it failing
> > if someone tries to build a module on gcc10 when the kernel was built
> > with gcc11.  It's tedious when the kernel was built with gcc11
> > yesterday, and a new gcc11 build today means that kernel needs to be
> > rebuilt.
> 
> Right.  It's a problem for distro users.  The compiler and kernel are in
> separate packages, with separate release cadences.  The latest compiler
> version may not exactly match what was used to build the latest kernel.

Given that distros _should_ be updating their kernel faster than the
compiler updates, what's the real issue here?  You build a kernel, and
all external modules, at the same time.  If you want to build them at
different times, you make your build system ensure they were the same
compiler so that you are "bug compatible".

And yes, it might be a pain if gcc11 gets updated every other day, but
as someone living with a "rolling-distro" you get used to it, otherwise
you just "pin" the build tools and keep that from happening.

This isn't a new thing, we've been doing this for decades, why is this
surprising?

thanks,

greg k-h
Justin Forbes Jan. 26, 2021, 5:47 p.m. UTC | #22
On Tue, Jan 26, 2021 at 11:07 AM Greg KH <greg@kroah.com> wrote:
>
> On Tue, Jan 26, 2021 at 10:19:34AM -0600, Josh Poimboeuf wrote:
> > On Tue, Jan 26, 2021 at 10:15:52AM -0600, Justin Forbes wrote:
> > > On Tue, Jan 26, 2021 at 10:05 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Tue, Jan 26, 2021 at 09:46:51AM -0600, Josh Poimboeuf wrote:
> > > > > On Tue, Jan 26, 2021 at 04:15:37PM +0100, Peter Zijlstra wrote:
> > > > > > On Tue, Jan 26, 2021 at 08:51:55AM -0600, Josh Poimboeuf wrote:
> > > > > > > User space mixes compiler versions all the time.  The C ABI is stable.
> > > > > > >
> > > > > > > What specifically is the harder issue you're referring to?
> > > > > >
> > > > > > I don't think the C ABI captures nearly enough. Imagine trying to mix a
> > > > > > compiler with and without asm-goto support (ok, we fail to build without
> > > > > > by now, but just imagine).
> > > > > >
> > > > > > No C ABI violated, but having that GCC extention vs not having it
> > > > > > radically changes the kernel ABI.
> > > > > >
> > > > > > I think I'm with Greg here, just don't do it.
> > > > >
> > > > > Ok, thank you for an actual example.  asm goto is a good one.
> > > > >
> > > > > But it's not a cut-and-dry issue.  Otherwise how could modversions
> > > > > possibly work?
> > > > >
> > > > > So yes, we should enforce GCC versions, but I still haven't seen a
> > > > > reason it should be more than just "same compiler and *major* version".
> > > >
> > > > Why bother? rebuilding the kernel and all modules is a matter of 10
> > > > minutes at most on a decently beefy build box.
> > > >
> > > > What actual problem are we trying to solve here?
> > >
> > > This is true for those of us used to working with source and building
> > > by hand. For users who want everything packaged, rebuilding a kernel
> > > package for install is considerably longer, and for distros providing
> > > builds for multiple arches, we are looking at a couple of hours at
> > > best.  From a Fedora standpoint, I am perfectly fine with it failing
> > > if someone tries to build a module on gcc10 when the kernel was built
> > > with gcc11.  It's tedious when the kernel was built with gcc11
> > > yesterday, and a new gcc11 build today means that kernel needs to be
> > > rebuilt.
> >
> > Right.  It's a problem for distro users.  The compiler and kernel are in
> > separate packages, with separate release cadences.  The latest compiler
> > version may not exactly match what was used to build the latest kernel.
>
> Given that distros _should_ be updating their kernel faster than the
> compiler updates, what's the real issue here?  You build a kernel, and
> all external modules, at the same time.  If you want to build them at
> different times, you make your build system ensure they were the same
> compiler so that you are "bug compatible".
>
> And yes, it might be a pain if gcc11 gets updated every other day, but
> as someone living with a "rolling-distro" you get used to it, otherwise
> you just "pin" the build tools and keep that from happening.
>
> This isn't a new thing, we've been doing this for decades, why is this
> surprising?

We definitely build considerably more kernels than toolchains. From a
rawhide standpoint though, a number of testers are willing to test RC
releases, but are not willing to run debug kernels, so they installed
rc4 yesterday, but will not install today's snapshot.  I will build
3-5 new kernels before they update to rc5.  We have been doing things
this way for over a decade. It has never been a problem until we
turned on CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL.  Suddenly I am
getting complaints.
Kees Cook Jan. 26, 2021, 5:56 p.m. UTC | #23
On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote:
> On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote:
> > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> > > When a GCC version mismatch is detected, print a warning and disable the
> > > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > > code to see the same struct layouts.  In that case print an error.
> > 
> > I prefer this patch as-is: only randstruct needs a hard failure. The
> > others likely work (in fact, randstruct likely works too).
> 
> I'm curious about this last statement, why would randstruct likely work?
> 
> Even struct module has '__randomize_layout', wouldn't basic module init
> go splat?

No; the seed is part of the generate includes -- you'll get the same
layout with the same seed.
Josh Poimboeuf Jan. 26, 2021, 6:43 p.m. UTC | #24
On Tue, Jan 26, 2021 at 09:56:10AM -0800, Kees Cook wrote:
> On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote:
> > On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote:
> > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> > > > When a GCC version mismatch is detected, print a warning and disable the
> > > > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > > > code to see the same struct layouts.  In that case print an error.
> > > 
> > > I prefer this patch as-is: only randstruct needs a hard failure. The
> > > others likely work (in fact, randstruct likely works too).
> > 
> > I'm curious about this last statement, why would randstruct likely work?
> > 
> > Even struct module has '__randomize_layout', wouldn't basic module init
> > go splat?
> 
> No; the seed is part of the generate includes -- you'll get the same
> layout with the same seed.

Right, but don't you need the plugin enabled to make use of that seed,
so the structs get interpreted properly by the module?  Or am I
completely misunderstanding how this plugin works?
Kees Cook Jan. 26, 2021, 10:59 p.m. UTC | #25
On Tue, Jan 26, 2021 at 12:43:16PM -0600, Josh Poimboeuf wrote:
> On Tue, Jan 26, 2021 at 09:56:10AM -0800, Kees Cook wrote:
> > On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote:
> > > On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote:
> > > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> > > > > When a GCC version mismatch is detected, print a warning and disable the
> > > > > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > > > > code to see the same struct layouts.  In that case print an error.
> > > > 
> > > > I prefer this patch as-is: only randstruct needs a hard failure. The
> > > > others likely work (in fact, randstruct likely works too).
> > > 
> > > I'm curious about this last statement, why would randstruct likely work?
> > > 
> > > Even struct module has '__randomize_layout', wouldn't basic module init
> > > go splat?
> > 
> > No; the seed is part of the generate includes -- you'll get the same
> > layout with the same seed.
> 
> Right, but don't you need the plugin enabled to make use of that seed,
> so the structs get interpreted properly by the module?  Or am I
> completely misunderstanding how this plugin works?

Having the plugin enabled or not is part of the Kconfig ... you can't
build anything if you change Kconfig. I feel like I'm missing
something...
Josh Poimboeuf Jan. 26, 2021, 11:32 p.m. UTC | #26
On Tue, Jan 26, 2021 at 02:59:57PM -0800, Kees Cook wrote:
> On Tue, Jan 26, 2021 at 12:43:16PM -0600, Josh Poimboeuf wrote:
> > On Tue, Jan 26, 2021 at 09:56:10AM -0800, Kees Cook wrote:
> > > On Mon, Jan 25, 2021 at 04:19:53PM -0600, Josh Poimboeuf wrote:
> > > > On Mon, Jan 25, 2021 at 02:03:07PM -0800, Kees Cook wrote:
> > > > > On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> > > > > > When a GCC version mismatch is detected, print a warning and disable the
> > > > > > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > > > > > code to see the same struct layouts.  In that case print an error.
> > > > > 
> > > > > I prefer this patch as-is: only randstruct needs a hard failure. The
> > > > > others likely work (in fact, randstruct likely works too).
> > > > 
> > > > I'm curious about this last statement, why would randstruct likely work?
> > > > 
> > > > Even struct module has '__randomize_layout', wouldn't basic module init
> > > > go splat?
> > > 
> > > No; the seed is part of the generate includes -- you'll get the same
> > > layout with the same seed.
> > 
> > Right, but don't you need the plugin enabled to make use of that seed,
> > so the structs get interpreted properly by the module?  Or am I
> > completely misunderstanding how this plugin works?
> 
> Having the plugin enabled or not is part of the Kconfig ... you can't
> build anything if you change Kconfig. I feel like I'm missing
> something...

I guess we crossed wires somehow.  Backing up :-)

The patch disables plugins when there's a GCC mismatch in the OOT module
build, with the exception of RANDSTRUCT, for which it just errors out.

When you said "randstruct likely works too" I thought you meant that
RANDSTRUCT would likely work even if it were disabled in the OOT module
build (i.e. if we removed the RANDSTRUCT special case from the patch).

Or did you mean something else?  Like using RANDSTRUCT with a different
version of GCC would likely work?

(I'm definitely not proposing we allow GCC mismatches for plugins, as I
was told that plugins can break from one build to the next).
Christoph Hellwig Jan. 27, 2021, 6:02 p.m. UTC | #27
Please don't add all this garbage.  We only add infrastructure to the
kernel for what the kernel itself needs, not for weird out of tree
infrastructure.

On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> When building out-of-tree kernel modules, the build system doesn't
> require the GCC version to match the version used to build the original
> kernel.  That's probably [1] fine.
> 
> In fact, for many distros, the version of GCC used to build the latest
> kernel doesn't necessarily match the latest released GCC, so a GCC
> mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> it's probably more common.
> 
> So a lot of users have come to rely on being able to use a different
> version of GCC when building OOT modules.
> 
> But with GCC plugins enabled, that's no longer allowed:
> 
>   cc1: error: incompatible gcc/plugin versions
>   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
> 
> That error comes from the plugin's call to
> plugin_default_version_check(), which strictly enforces the GCC version.
> The strict check makes sense, because there's nothing to prevent the GCC
> plugin ABI from changing -- and it often does.
> 
> But failing the build isn't necessary.  For most plugins, OOT modules
> will otherwise work just fine without the plugin instrumentation.
> 
> When a GCC version mismatch is detected, print a warning and disable the
> plugin.  The only exception is the RANDSTRUCT plugin which needs all
> code to see the same struct layouts.  In that case print an error.
> 
> [1] Ignoring, for the moment, that the kernel now has
>     toolchain-dependent kconfig options, which can silently disable
>     features and cause havoc when compiler versions differ, or even when
>     certain libraries are missing.  This is a separate problem which
>     also needs to be addressed.
> 
> Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  scripts/Makefile.gcc-plugins | 19 +++++++++++++++++++
>  scripts/Makefile.kcov        | 11 +++++++++++
>  2 files changed, 30 insertions(+)
> 
> diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
> index 952e46876329..7227692fba59 100644
> --- a/scripts/Makefile.gcc-plugins
> +++ b/scripts/Makefile.gcc-plugins
> @@ -51,6 +51,25 @@ export DISABLE_ARM_SSP_PER_TASK_PLUGIN
>  GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
>  # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here.
>  GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS))
> +
> +# Out-of-tree module check: If there's a GCC version mismatch, disable plugins
> +# and print a warning.  Otherwise the OOT module build will fail due to
> +# plugin_default_version_check().
> +ifneq ($(GCC_PLUGINS_CFLAGS),)
> +    ifneq ($(KBUILD_EXTMOD),)
> +        ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX)))
> +
> +            ifdef CONFIG_GCC_PLUGIN_RANDSTRUCT
> +                $(error error: CONFIG_GCC_PLUGIN_RANDSTRUCT requires out-of-tree modules to be built using the same GCC version as the kernel.)
> +            endif
> +
> +            $(warning warning: Disabling GCC plugins for out-of-tree modules due to GCC version mismatch.)
> +            $(warning warning: The following plugins have been disabled: $(gcc-plugin-y))
> +            GCC_PLUGINS_CFLAGS :=
> +	endif
> +    endif
> +endif
> +
>  export GCC_PLUGINS_CFLAGS
>  
>  # Add the flags to the build!
> diff --git a/scripts/Makefile.kcov b/scripts/Makefile.kcov
> index 67e8cfe3474b..63a2bc2aabb2 100644
> --- a/scripts/Makefile.kcov
> +++ b/scripts/Makefile.kcov
> @@ -3,4 +3,15 @@ kcov-flags-$(CONFIG_CC_HAS_SANCOV_TRACE_PC)	+= -fsanitize-coverage=trace-pc
>  kcov-flags-$(CONFIG_KCOV_ENABLE_COMPARISONS)	+= -fsanitize-coverage=trace-cmp
>  kcov-flags-$(CONFIG_GCC_PLUGIN_SANCOV)		+= -fplugin=$(objtree)/scripts/gcc-plugins/sancov_plugin.so
>  
> +# Out-of-tree module check for GCC version mismatch.
> +# See the similar check in scripts/Makefile.gcc-plugins
> +ifneq ($(CONFIG_GCC_PLUGIN_SANCOV),)
> +    ifneq ($(KBUILD_EXTMOD),)
> +        ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX)))
> +            $(warning warning: Disabling CONFIG_GCC_PLUGIN_SANCOV for out-of-tree modules due to GCC version mismatch.)
> +            kcov-flags-y := $(filter-out %/sancov_plugin.so, $(kcov-flags-y))
> +        endif
> +    endif
> +endif
> +
>  export CFLAGS_KCOV := $(kcov-flags-y)
> -- 
> 2.29.2
> 
---end quoted text---
Christoph Hellwig Jan. 27, 2021, 6:03 p.m. UTC | #28
On Tue, Jan 26, 2021 at 06:44:44AM -0600, Justin Forbes wrote:
> Support and enforce are 2 completely different things.  To shed a bit
> more light on this, the real issue that prompted this was breaking CI
> systems.  As we enabled gcc plugins in Fedora, and the toolchain folks
> went through 3 different snapshots of gcc 11 in a week. Any CI process
> that built an out of tree module failed. I don't think this is nearly
> as much of a concern for stable distros, as it is for CI in
> development cycles.

And the answer is trivial: don't build out of tree modules.
Josh Poimboeuf Jan. 27, 2021, 6:38 p.m. UTC | #29
On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote:
> Please don't add all this garbage.  We only add infrastructure to the
> kernel for what the kernel itself needs, not for weird out of tree
> infrastructure.

This isn't new, the kernel already has the infrastructure for building
out-of-tree modules.  It's widely used.  Are you suggesting we remove
it?  Good luck with that...

Either it should be supported, or not.  Make the case either way.  But I
can't understand why people are advocating to leave it half-broken.
Christoph Hellwig Jan. 27, 2021, 6:43 p.m. UTC | #30
On Wed, Jan 27, 2021 at 12:38:56PM -0600, Josh Poimboeuf wrote:
> On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote:
> > Please don't add all this garbage.  We only add infrastructure to the
> > kernel for what the kernel itself needs, not for weird out of tree
> > infrastructure.
> 
> This isn't new, the kernel already has the infrastructure for building
> out-of-tree modules.  It's widely used.  Are you suggesting we remove
> it?  Good luck with that...
> 
> Either it should be supported, or not.  Make the case either way.  But I
> can't understand why people are advocating to leave it half-broken.


It is not support as any kind of interface.  It is a little aid for
local development.  Adding any kond of complexities for out of tree
modules is a complete no-go.
Josh Poimboeuf Jan. 27, 2021, 6:51 p.m. UTC | #31
On Wed, Jan 27, 2021 at 06:43:27PM +0000, Christoph Hellwig wrote:
> On Wed, Jan 27, 2021 at 12:38:56PM -0600, Josh Poimboeuf wrote:
> > On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote:
> > > Please don't add all this garbage.  We only add infrastructure to the
> > > kernel for what the kernel itself needs, not for weird out of tree
> > > infrastructure.
> > 
> > This isn't new, the kernel already has the infrastructure for building
> > out-of-tree modules.  It's widely used.  Are you suggesting we remove
> > it?  Good luck with that...
> > 
> > Either it should be supported, or not.  Make the case either way.  But I
> > can't understand why people are advocating to leave it half-broken.
> 
> 
> It is not support as any kind of interface.  It is a little aid for
> local development.

Is this a joke?  I've never met anybody who builds OOT modules as a
development aid...

On the other hand I know of several very popular distros (some paid,
some not) who rely on allowing users/partners to build OOT modules as
part of their ecosystem.  To say it's not supported is a farce.
David Laight Jan. 27, 2021, 10:09 p.m. UTC | #32
From: Josh Poimboeuf
> Sent: 27 January 2021 18:51
> 
> On Wed, Jan 27, 2021 at 06:43:27PM +0000, Christoph Hellwig wrote:
> > On Wed, Jan 27, 2021 at 12:38:56PM -0600, Josh Poimboeuf wrote:
> > > On Wed, Jan 27, 2021 at 06:02:15PM +0000, Christoph Hellwig wrote:
> > > > Please don't add all this garbage.  We only add infrastructure to the
> > > > kernel for what the kernel itself needs, not for weird out of tree
> > > > infrastructure.
> > >
> > > This isn't new, the kernel already has the infrastructure for building
> > > out-of-tree modules.  It's widely used.  Are you suggesting we remove
> > > it?  Good luck with that...
> > >
> > > Either it should be supported, or not.  Make the case either way.  But I
> > > can't understand why people are advocating to leave it half-broken.
> >
> >
> > It is not support as any kind of interface.  It is a little aid for
> > local development.
> 
> Is this a joke?  I've never met anybody who builds OOT modules as a
> development aid...
> 
> On the other hand I know of several very popular distros (some paid,
> some not) who rely on allowing users/partners to build OOT modules as
> part of their ecosystem.  To say it's not supported is a farce.

Indeed there are plenty of companies who provide kernel modules
(wholly or partly in source form) for their customers to build as OOT
modules to install in distro built kernels.

These modules have to compile against everything from RHEL6 (2.6.32 base)
through to the current -rc release.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Christoph Hellwig Jan. 28, 2021, 2:29 p.m. UTC | #33
On Wed, Jan 27, 2021 at 12:51:13PM -0600, Josh Poimboeuf wrote:
> Is this a joke?  I've never met anybody who builds OOT modules as a
> development aid...

I'm pretty sure you've met me before.

> On the other hand I know of several very popular distros (some paid,
> some not) who rely on allowing users/partners to build OOT modules as
> part of their ecosystem.  To say it's not supported is a farce.

This is not a farce.  The kernel only supports infrastructure for the
kernel itself, not for any external consumers.  If you have a business
model that relies on something else you should think hard if you are in
the right business.
Josh Poimboeuf Jan. 28, 2021, 3:45 p.m. UTC | #34
On Thu, Jan 28, 2021 at 02:29:52PM +0000, Christoph Hellwig wrote:
> On Wed, Jan 27, 2021 at 12:51:13PM -0600, Josh Poimboeuf wrote:
> > Is this a joke?  I've never met anybody who builds OOT modules as a
> > development aid...
> 
> I'm pretty sure you've met me before.

Yes, but I don't recall this topic coming up :-)

> > On the other hand I know of several very popular distros (some paid,
> > some not) who rely on allowing users/partners to build OOT modules as
> > part of their ecosystem.  To say it's not supported is a farce.
> 
> This is not a farce.  The kernel only supports infrastructure for the
> kernel itself, not for any external consumers.  If you have a business
> model that relies on something else you should think hard if you are in
> the right business.

Thanks for letting me know.  I'll send the memo to the billions of
affected devices.
Josh Poimboeuf March 2, 2021, 11:26 p.m. UTC | #35
On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> When building out-of-tree kernel modules, the build system doesn't
> require the GCC version to match the version used to build the original
> kernel.  That's probably [1] fine.
> 
> In fact, for many distros, the version of GCC used to build the latest
> kernel doesn't necessarily match the latest released GCC, so a GCC
> mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> it's probably more common.
> 
> So a lot of users have come to rely on being able to use a different
> version of GCC when building OOT modules.
> 
> But with GCC plugins enabled, that's no longer allowed:
> 
>   cc1: error: incompatible gcc/plugin versions
>   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
> 
> That error comes from the plugin's call to
> plugin_default_version_check(), which strictly enforces the GCC version.
> The strict check makes sense, because there's nothing to prevent the GCC
> plugin ABI from changing -- and it often does.
> 
> But failing the build isn't necessary.  For most plugins, OOT modules
> will otherwise work just fine without the plugin instrumentation.
> 
> When a GCC version mismatch is detected, print a warning and disable the
> plugin.  The only exception is the RANDSTRUCT plugin which needs all
> code to see the same struct layouts.  In that case print an error.

Hi Masahiro,

This problem is becoming more prevalent.  We will need to fix it one way
or another, if we want to support distro adoption of these GCC
plugin-based features.

Frank suggested a possibly better idea: always rebuild the plugins when
the GCC version changes.  What do you think?  Any suggestions on how to
implement that?  Otherwise I can try to hack something together.
Masahiro Yamada March 3, 2021, 6:49 p.m. UTC | #36
On Wed, Mar 3, 2021 at 8:27 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Mon, Jan 25, 2021 at 02:42:10PM -0600, Josh Poimboeuf wrote:
> > When building out-of-tree kernel modules, the build system doesn't
> > require the GCC version to match the version used to build the original
> > kernel.  That's probably [1] fine.
> >
> > In fact, for many distros, the version of GCC used to build the latest
> > kernel doesn't necessarily match the latest released GCC, so a GCC
> > mismatch turns out to be pretty common.  And with CONFIG_MODVERSIONS
> > it's probably more common.
> >
> > So a lot of users have come to rely on being able to use a different
> > version of GCC when building OOT modules.
> >
> > But with GCC plugins enabled, that's no longer allowed:
> >
> >   cc1: error: incompatible gcc/plugin versions
> >   cc1: error: failed to initialize plugin ./scripts/gcc-plugins/structleak_plugin.so
> >
> > That error comes from the plugin's call to
> > plugin_default_version_check(), which strictly enforces the GCC version.
> > The strict check makes sense, because there's nothing to prevent the GCC
> > plugin ABI from changing -- and it often does.
> >
> > But failing the build isn't necessary.  For most plugins, OOT modules
> > will otherwise work just fine without the plugin instrumentation.
> >
> > When a GCC version mismatch is detected, print a warning and disable the
> > plugin.  The only exception is the RANDSTRUCT plugin which needs all
> > code to see the same struct layouts.  In that case print an error.
>
> Hi Masahiro,
>
> This problem is becoming more prevalent.  We will need to fix it one way
> or another, if we want to support distro adoption of these GCC
> plugin-based features.
>
> Frank suggested a possibly better idea: always rebuild the plugins when
> the GCC version changes.


That is just another form of the previous patch,
which was already rejected.


- That is a hack just for external modules
- Our consensus is, use the same version for the kernel and external modules



I use Ubuntu, and I do not see such a problem.
(I have never seen it on Debian either, except sid.)

I see Fedora providing GCC whose version is different
from the one used for building the kernel.
That is a problem on the distribution side.



In my ubuntu.

$ grep CC_VERSION_TEXT   /lib/modules/$(uname -r)/build/.config
CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0"
$ gcc --version  | head -n1
gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0



> What do you think?

NACK.


>  Any suggestions on how to
> implement that?  Otherwise I can try to hack something together.
>
> --
> Josh
>
--
Best Regards
Masahiro Yamada
Josh Poimboeuf March 3, 2021, 7:15 p.m. UTC | #37
On Thu, Mar 04, 2021 at 03:49:35AM +0900, Masahiro Yamada wrote:
> > This problem is becoming more prevalent.  We will need to fix it one way
> > or another, if we want to support distro adoption of these GCC
> > plugin-based features.
> >
> > Frank suggested a possibly better idea: always rebuild the plugins when
> > the GCC version changes.
> 
> 
> That is just another form of the previous patch,
> which was already rejected.
> 
> 
> - That is a hack just for external modules
> - Our consensus is, use the same version for the kernel and external modules
> 
> 
> 
> I use Ubuntu, and I do not see such a problem.
> (I have never seen it on Debian either, except sid.)
> 
> I see Fedora providing GCC whose version is different
> from the one used for building the kernel.
> That is a problem on the distribution side.

I don't understand.  Are you suggesting that a distro should always
release GCC and kernel updates simultaneously?

How is this problem specific to Fedora?  Please be specific about what
Ubuntu and Debian do, which Fedora doesn't do.

Adding Linus, who indicated in another thread that we shouldn't force
exact GCC versions because there's no technical reason to do so.
Linus Torvalds March 3, 2021, 7:25 p.m. UTC | #38
On Wed, Mar 3, 2021 at 11:15 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> Adding Linus, who indicated in another thread that we shouldn't force
> exact GCC versions because there's no technical reason to do so.

I do not believe we should recompile everything just because the gcc
version changes.

But gcc _plugins_ certainly should depend on the kernel version.

Very few people should be enabling the gcc plugins in the first place.
Honestly, most of them are bad, and the people who really care about
those things have already moved to clang which does the important
parts natively without the need for a plugin. I'm personally waiting
for the day when we can just say "let's remove them". But in the
meantime, making the plugins depend on the gcc version some way is
certainly better than not doing so.

                  Linus
Josh Poimboeuf March 3, 2021, 7:38 p.m. UTC | #39
On Wed, Mar 03, 2021 at 11:25:34AM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 11:15 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > Adding Linus, who indicated in another thread that we shouldn't force
> > exact GCC versions because there's no technical reason to do so.
> 
> I do not believe we should recompile everything just because the gcc
> version changes.
> 
> But gcc _plugins_ certainly should depend on the kernel version.
> 
> Very few people should be enabling the gcc plugins in the first place.
> Honestly, most of them are bad, and the people who really care about
> those things have already moved to clang which does the important
> parts natively without the need for a plugin. I'm personally waiting
> for the day when we can just say "let's remove them".

You might be sad to learn that some of the plugins are useful for
hardening of a production distro kernel, like stackleak and structleak.

> But in the meantime, making the plugins depend on the gcc version some
> way is certainly better than not doing so.

So currently, the plugins already so that.  They require the GCC version
to be exact.  If there's a mismatch, then it fails the OOT module build.

But that's not usable for a distro.  When users build OOT modules with a
slight GCC mismatch, it breaks the build, effectively requiring the
exact same GCC version for *all* OOT builds going forward.

So there have been a few proposals to better handle GCC version
mismatches:

1) disable the plugin - this works fine for most plugins except
   randstruct

2) rebuild the plugin whenever the GCC version changes

3) fail the build, like today - effectively blocks distros from using
   plugins
Linus Torvalds March 3, 2021, 7:57 p.m. UTC | #40
On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> > But in the meantime, making the plugins depend on the gcc version some
> > way is certainly better than not doing so.
>
> So currently, the plugins already so that.  They require the GCC version
> to be exact.  If there's a mismatch, then it fails the OOT module build.

That's not my experience.

Yes, the build fails, but it fails not by _rebuilding_, but by failing
with an error.

IOW, it's a dependency problem.

That said, I absolutely think that distros that think that stackleak
is an important plugin should seriously have a plan to just move to
clang, or push the gcc people to just add the
"-ftrivial-auto-var-init" thing (and talk sense to the clang people
who think that zero isn't good, and want to force a "pattern", but
happily they haven't taken over the world yet).

The kernel gcc plugins _will_ go away eventually. They are an
unmitigated disaster. They always have been. I'm sorry I ever merged
that support. It's not only a maintenance nightmare, it's just a
horrible thing and interface in the first place. It's literally BAD
TECHNOLOGY.

Gcc plugins were badly done. They _should_ have been done twenty years
ago as a proper IR (and people very much asked for them), but for
political reasons the FSF was very much against any kind of
intermediate representation that could be hooked into. It's one of the
reasons clang has been so successful - having the whole LLVM IR model
has made life _so_ much better for anybody working on any kind of
compiler that it's not even funny.

Gcc plugins were too little, too late, and are not even remotely a
good model technically. LLVM did things right with a well-defined IR
front and center, and while I dearly love gcc for a lot of reasons, I
absolutely despise how badly gcc handled this all - and I despise how
that horrible decision was never about technology, and was always due
to bad politics on the part of FSF and rms.

End result: gcc plugins are pure garbage, and you should shun them. If
you really believe you need compiler plugins, you should look at
clang.

That really is the only sane technical answer.

             Linus
Josh Poimboeuf March 3, 2021, 8:24 p.m. UTC | #41
On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > > But in the meantime, making the plugins depend on the gcc version some
> > > way is certainly better than not doing so.
> >
> > So currently, the plugins already so that.  They require the GCC version
> > to be exact.  If there's a mismatch, then it fails the OOT module build.
> 
> That's not my experience.
> 
> Yes, the build fails, but it fails not by _rebuilding_, but by failing
> with an error.

Um, that's what I said.  It does not rebuild.  It fails with an error.

The *proposal* is to rebuild the plugin -- which Masahiro nacked because
he claims GCC mismatches aren't supported for OOT builds (plugin or
not).

Your nack is for a different reason: GCC plugins are second-class
citizens.  Fair enough...
Josh Poimboeuf March 3, 2021, 8:31 p.m. UTC | #42
On Wed, Mar 03, 2021 at 02:24:12PM -0600, Josh Poimboeuf wrote:
> On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote:
> > On Wed, Mar 3, 2021 at 11:38 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > > But in the meantime, making the plugins depend on the gcc version some
> > > > way is certainly better than not doing so.
> > >
> > > So currently, the plugins already so that.  They require the GCC version
> > > to be exact.  If there's a mismatch, then it fails the OOT module build.
> > 
> > That's not my experience.
> > 
> > Yes, the build fails, but it fails not by _rebuilding_, but by failing
> > with an error.
> 
> Um, that's what I said.  It does not rebuild.  It fails with an error.
> 
> The *proposal* is to rebuild the plugin -- which Masahiro nacked because
> he claims GCC mismatches aren't supported for OOT builds (plugin or
> not).
> 
> Your nack is for a different reason: GCC plugins are second-class
> citizens.  Fair enough...

Or was it a nack? :-)  Reading your message again, I may have
misinterpreted.  Put simply, should we rebuild plugins when the GCC
version changes?
Linus Torvalds March 3, 2021, 8:56 p.m. UTC | #43
On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> Your nack is for a different reason: GCC plugins are second-class
> citizens.  Fair enough...

MNo, I didn't NAK it. Quite the reverser.

I am ABSOLUTELY against rebuilding normal object files just because
gcc versions change. A compiler version change makes zero difference
for any normal object file.

But the gcc plugins are different. They very much _are_ tied to a
particular gcc version.

Now, they are tied to a particular gcc version because they are
horribly badly done, and bad technology, and I went off on a bit of a
rant about just how bad they are, but the point is that gcc plugins
depend on the exact gcc version in ways that normal object files do
_not_.

               Linus
Josh Poimboeuf March 3, 2021, 9:45 p.m. UTC | #44
On Wed, Mar 03, 2021 at 12:56:52PM -0800, Linus Torvalds wrote:
> On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> >
> > Your nack is for a different reason: GCC plugins are second-class
> > citizens.  Fair enough...
> 
> MNo, I didn't NAK it. Quite the reverser.
> 
> I am ABSOLUTELY against rebuilding normal object files just because
> gcc versions change. A compiler version change makes zero difference
> for any normal object file.
> 
> But the gcc plugins are different. They very much _are_ tied to a
> particular gcc version.
> 
> Now, they are tied to a particular gcc version because they are
> horribly badly done, and bad technology, and I went off on a bit of a
> rant about just how bad they are, but the point is that gcc plugins
> depend on the exact gcc version in ways that normal object files do
> _not_.

Thanks, reading comprehension is hard.  I realized after re-reading that
I interpreted your "plugins should depend on the kernel version"
statement too broadly.

Masahiro, any idea how I can make the GCC version a build dependency?
Kees Cook March 3, 2021, 9:52 p.m. UTC | #45
On Wed, Mar 03, 2021 at 11:57:33AM -0800, Linus Torvalds wrote:
> End result: gcc plugins are pure garbage, and you should shun them. If

I think that's pretty harsh, but okay, opinions are opinions. As Josh
says, people are interested in them for not-uncommon real-world uses:

- stackleak has data-lifetime wiping coverage that -ftrivial-auto-var-init=zero
  for either Clang or GCC doesn't cover. There are no plans to provide
  such coverage under Clang yet. It's arguable that stackleak's benefits
  are smaller than it's maintenance burden compared to having
  -ftrivial-auto-var-init=zero, but we can have that conversation when
  we get there. :)

- structleak is likely to vanish as soon as GCC supports
  -ftrivial-auto-var-init=zero. Clang's implementation is done and in
  use by every Clang-built kernel I know of.

- latent_entropy is likely less useful since the jitter entropy was added,
  but I've not seen anyone analyze it. No "upstream" GCC nor Clang support
  is planned.

- arm32 per-task stack protector canary meaningfully reduces the risk of
  stack content exposures vs stack frame overwrites, but neither GCC
  nor Clang seem interested in implementing this "correctly" (as done
  for arm64 on GCC -- Clang doesn't have this for arm64 yet). I want this
  fixed for arm64 on Clang, and maybe arm32 can be done at the same time.

- randstruct is likely not used for distro kernels, but very much for
  end users where security has a giant priority over performance.
  There's no "upstream" GCC plan for this, and the Clang support has
  stalled.

> you really believe you need compiler plugins, you should look at
> clang.

This is currently true only in 1 case (structleak), and only a few
"traditional" distros are building the kernel with Clang right now. I
don't disagree: doing this via LLVM IR would be much easier, but the
implementations for the other above features don't exist yet. I expect
this to change over time (I expect Clang's randstruct and GCC's
-ftrivial-auto-var-init=zero to likely be the next two things to
appear), but it's not the case right now.
Masahiro Yamada March 4, 2021, 12:26 p.m. UTC | #46
On Thu, Mar 4, 2021 at 4:15 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Thu, Mar 04, 2021 at 03:49:35AM +0900, Masahiro Yamada wrote:
> > > This problem is becoming more prevalent.  We will need to fix it one way
> > > or another, if we want to support distro adoption of these GCC
> > > plugin-based features.
> > >
> > > Frank suggested a possibly better idea: always rebuild the plugins when
> > > the GCC version changes.
> >
> >
> > That is just another form of the previous patch,
> > which was already rejected.
> >
> >
> > - That is a hack just for external modules
> > - Our consensus is, use the same version for the kernel and external modules
> >
> >
> >
> > I use Ubuntu, and I do not see such a problem.
> > (I have never seen it on Debian either, except sid.)
> >
> > I see Fedora providing GCC whose version is different
> > from the one used for building the kernel.
> > That is a problem on the distribution side.
>
> I don't understand.  Are you suggesting that a distro should always
> release GCC and kernel updates simultaneously?

Well, as Greg says, given that GCC is less frequently upgraded
than the kernel, this should not be a big deal.

https://lore.kernel.org/lkml/YBBML2eB%2FIXcTvcn@kroah.com/



> How is this problem specific to Fedora?  Please be specific about what
> Ubuntu and Debian do, which Fedora doesn't do.

I do not know what they exactly do. Looking at only the result,
Ubuntu and Debian do proper engineering, which Fedora doesn't.




>
> Adding Linus, who indicated in another thread that we shouldn't force
> exact GCC versions because there's no technical reason to do so.
>
> --
> Josh
>
Masahiro Yamada March 4, 2021, 12:27 p.m. UTC | #47
On Thu, Mar 4, 2021 at 6:45 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Wed, Mar 03, 2021 at 12:56:52PM -0800, Linus Torvalds wrote:
> > On Wed, Mar 3, 2021 at 12:24 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> > >
> > > Your nack is for a different reason: GCC plugins are second-class
> > > citizens.  Fair enough...
> >
> > MNo, I didn't NAK it. Quite the reverser.
> >
> > I am ABSOLUTELY against rebuilding normal object files just because
> > gcc versions change. A compiler version change makes zero difference
> > for any normal object file.
> >
> > But the gcc plugins are different. They very much _are_ tied to a
> > particular gcc version.
> >
> > Now, they are tied to a particular gcc version because they are
> > horribly badly done, and bad technology, and I went off on a bit of a
> > rant about just how bad they are, but the point is that gcc plugins
> > depend on the exact gcc version in ways that normal object files do
> > _not_.
>
> Thanks, reading comprehension is hard.  I realized after re-reading that
> I interpreted your "plugins should depend on the kernel version"
> statement too broadly.
>
> Masahiro, any idea how I can make the GCC version a build dependency?


I agree with rebuilding GCC plugins when the compiler is upgraded
for *in-tree* building.
Linus had reported it a couple of months before,
and I just submitted a very easy fix.

Rebuilding plugins for external modules is not easy;
plugins are placed in the read-only directory,
/usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/.

The external modules must not (cannot) update in-tree
build artifacts.  "Rebuild" means creating copies in a different
writable directory.
Doing that requires a lot of design changes.
Josh Poimboeuf March 4, 2021, 3:08 p.m. UTC | #48
On Thu, Mar 04, 2021 at 09:27:28PM +0900, Masahiro Yamada wrote:
> I agree with rebuilding GCC plugins when the compiler is upgraded
> for *in-tree* building.
> Linus had reported it a couple of months before,
> and I just submitted a very easy fix.

Hm?  So does that mean that a GCC version change won't trigger a
tree-wide rebuild?  So you're asserting that a GCC mismatch is ok for
in-tree code, but not for external modules???  That seems backwards.

For in-tree, why not just rebuild the entire tree?  Some kernel features
are dependent on compiler version or capability, so not rebuilding the
tree could introduce silent breakage.

For external modules, a tree-wide rebuild isn't an option so the risk is
assumed by the user.  I posted a patch earlier [1] which prints a
warning if the compiler major/minor version changes with an external
module build.

[1] https://lkml.kernel.org/r/20210201211322.t2rxmvnrystc2ky7@treble

> Rebuilding plugins for external modules is not easy;
> plugins are placed in the read-only directory,
> /usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/.
> 
> The external modules must not (cannot) update in-tree
> build artifacts.  "Rebuild" means creating copies in a different
> writable directory.
> Doing that requires a lot of design changes.

Ok.  So it sounds like the best/easiest option is the original patch in
this thread:  when building an external module with a GCC mismatch, just
disable the GCC plugin, with a warning (or an error for randstruct).
Masahiro Yamada March 4, 2021, 3:35 p.m. UTC | #49
On Fri, Mar 5, 2021 at 12:08 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Thu, Mar 04, 2021 at 09:27:28PM +0900, Masahiro Yamada wrote:
> > I agree with rebuilding GCC plugins when the compiler is upgraded
> > for *in-tree* building.
> > Linus had reported it a couple of months before,
> > and I just submitted a very easy fix.
>
> Hm?  So does that mean that a GCC version change won't trigger a
> tree-wide rebuild?  So you're asserting that a GCC mismatch is ok for
> in-tree code, but not for external modules???  That seems backwards.
>
> For in-tree, why not just rebuild the entire tree?  Some kernel features
> are dependent on compiler version or capability, so not rebuilding the
> tree could introduce silent breakage.



All the kernel-space objects are rebuilt
when the compiler is upgraded.
(See commit 8b59cd81dc5e724eaea283fa6006985891c7bff4)


Linus complaint about GCC plugins not being rebuilt.
                      ^^^^^^^^^^^

https://lore.kernel.org/lkml/CAHk-=wieoN5ttOy7SnsGwZv+Fni3R6m-Ut=oxih6bbZ28G+4dw@mail.gmail.com/


That is easy to fix. I submitted a patch:
https://patchwork.kernel.org/project/linux-kbuild/patch/20210304113708.215121-1-masahiroy@kernel.org/




> For external modules, a tree-wide rebuild isn't an option so the risk is
> assumed by the user.  I posted a patch earlier [1] which prints a
> warning if the compiler major/minor version changes with an external
> module build.
>
> [1] https://lkml.kernel.org/r/20210201211322.t2rxmvnrystc2ky7@treble
>
> > Rebuilding plugins for external modules is not easy;
> > plugins are placed in the read-only directory,
> > /usr/src/linux-headers-$(uname -r)/scripts/gcc-plugins/.
> >
> > The external modules must not (cannot) update in-tree
> > build artifacts.  "Rebuild" means creating copies in a different
> > writable directory.
> > Doing that requires a lot of design changes.
>
> Ok.  So it sounds like the best/easiest option is the original patch in
> this thread:  when building an external module with a GCC mismatch, just
> disable the GCC plugin, with a warning (or an error for randstruct).

It was rejected.


If a distribution wants to enable CONFIG_GCC_PLUGINS,
it must provide GCC whose version is the same as
used for building the kernel.

If a distribution cannot manage release in that way,
do not enable CONFIG_GCC_PLUGINS in the first place.
Linus Torvalds March 4, 2021, 7:12 p.m. UTC | #50
On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> All the kernel-space objects are rebuilt
> when the compiler is upgraded.

I very much NAK'ed that one. Why did that go in?

Or maybe I NAK'ed another version of it (I think the one I NAK'ed was
from Josh), and didn't realize that there were multiple ones.

> Linus complaint about GCC plugins not being rebuilt.

Yes, and that was a separate complaint and not at all tied to the other objects.

Gcc plugins aren't kernel object files at all. They are very much like
dynamically loadable libraries to gcc itself.

               Linus
Josh Poimboeuf March 5, 2021, 2:41 a.m. UTC | #51
On Thu, Mar 04, 2021 at 11:12:42AM -0800, Linus Torvalds wrote:
> On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > All the kernel-space objects are rebuilt
> > when the compiler is upgraded.
> 
> I very much NAK'ed that one. Why did that go in?
> 
> Or maybe I NAK'ed another version of it (I think the one I NAK'ed was
> from Josh), and didn't realize that there were multiple ones.

This thread is making me dizzy, but I think the patch you NAK'ed from me
was different.  It just added an error on GCC mismatch with external
modules:

  https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoimboe@redhat.com

though I think you were ok with downgrading it to a warning.
Linus Torvalds March 5, 2021, 2:49 a.m. UTC | #52
On Thu, Mar 4, 2021 at 6:41 PM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> This thread is making me dizzy, but I think the patch you NAK'ed from me
> was different.  It just added an error on GCC mismatch with external
> modules:
>
>   https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoimboe@redhat.com
>
> though I think you were ok with downgrading it to a warning.

Right you are. So that was about the external module thing, which is
obviously incidentally an issue here with the plugin behavior too, but
different.

My inbox is kind of a modern-day Colossal Cave adventure: "You are in
a maze of twisty email threads, all similar but with different hidden
details".

               Linus
Masahiro Yamada March 5, 2021, 3:31 p.m. UTC | #53
On Fri, Mar 5, 2021 at 4:13 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > All the kernel-space objects are rebuilt
> > when the compiler is upgraded.
>
> I very much NAK'ed that one. Why did that go in?


When the compiler is upgraded, all objects
should be rebuilt by the new compiler,
- this keeps Kbuild deterministic,
irrespective of whether it is a fresh build,
or incremental build.


If we do not force the full rebuild,
the banner at boot time is no point.

[    0.000000] Linux version 5.8.0-44-generic (buildd@lgw01-amd64-039)
(gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0, GNU ld (GNU Binutils for
Ubuntu) 2.35.1) #50-Ubuntu SMP Tue Feb 9 06:29:41 UTC 2021 (Ubuntu
5.8.0-44.50-generic 5.8.18)


It claims it was built by
gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0
but we would never know if it is true
for whole objects.
Some of them might have been compiled by
an older compiler.




> Or maybe I NAK'ed another version of it (I think the one I NAK'ed was
> from Josh), and didn't realize that there were multiple ones.
>
> > Linus complaint about GCC plugins not being rebuilt.
>
> Yes, and that was a separate complaint and not at all tied to the other objects.
>
> Gcc plugins aren't kernel object files at all. They are very much like
> dynamically loadable libraries to gcc itself.
>
>                Linus
Masahiro Yamada March 5, 2021, 4:03 p.m. UTC | #54
On Fri, Mar 5, 2021 at 11:41 AM Josh Poimboeuf <jpoimboe@redhat.com> wrote:
>
> On Thu, Mar 04, 2021 at 11:12:42AM -0800, Linus Torvalds wrote:
> > On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> > >
> > > All the kernel-space objects are rebuilt
> > > when the compiler is upgraded.
> >
> > I very much NAK'ed that one. Why did that go in?
> >
> > Or maybe I NAK'ed another version of it (I think the one I NAK'ed was
> > from Josh), and didn't realize that there were multiple ones.
>
> This thread is making me dizzy,

Me too.

> Ok.  So it sounds like the best/easiest option is the original patch in
> this thread:  when building an external module with a GCC mismatch, just
> disable the GCC plugin, with a warning (or an error for randstruct).

Just for clarification,
I believe "the original patch" pointed to this one:
https://lore.kernel.org/lkml/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com/

This is dead. Please do not come back to this.
See negative comments not only from me,
but also from Greg, Peter, Christoph.



> but I think the patch you NAK'ed from me
> was different.  It just added an error on GCC mismatch with external
> modules:
>
>   https://lkml.kernel.org/r/fff056a7c9e6050c2d60910f70b6d99602f3bec4.1611863075.git.jpoimboe@redhat.com
>
> though I think you were ok with downgrading it to a warning.

I think this was not NAKed.

At first, Linus NAKed it, but I think it was
just because of some misunderstanding.

I do not have an objection about checking
compiler version difference for external module
builds.

I was pointing out some mistakes in the Makefile implementation.
(I would say "not that way" even with your second trial).


--
Best Regards
Masahiro Yamada
Josh Poimboeuf March 5, 2021, 7:18 p.m. UTC | #55
On Sat, Mar 06, 2021 at 01:03:32AM +0900, Masahiro Yamada wrote:
> > Ok.  So it sounds like the best/easiest option is the original patch in
> > this thread:  when building an external module with a GCC mismatch, just
> > disable the GCC plugin, with a warning (or an error for randstruct).
> 
> Just for clarification,
> I believe "the original patch" pointed to this one:
> https://lore.kernel.org/lkml/efe6b039a544da8215d5e54aa7c4b6d1986fc2b0.1611607264.git.jpoimboe@redhat.com/
> 
> This is dead. Please do not come back to this.

Sorry, no.  The patch may have been crap, but that doesn't make the
problem I'm trying to solve any less valid.

> See negative comments not only from me, but also from Greg, Peter,
> Christoph.

I responded to those.  Summarizing my replies once again...


- "External modules aren't supported"

  This doesn't even remotely match reality.  Are you honestly using this
  "negative comment" as a reason to NAK the patch?


- "External modules must be built with the same GCC version"

  As has been stated repeatedly, by Linus and others, there's no
  technical reason behind this claim.  It ignores the realities of how
  distros release the kernel and compiler independently, with separate
  cadences.  Minor variances in compiler version are ABI compatible.

  Also, for features which are dependent on compiler version, many of
  those are now enabled by kbuild.  As I suggested to you previously,
  kbuild should warn when such features get disabled (which can happen
  due to a compiler/toolchain change or due to a .config copied from
  another system).
David Laight March 8, 2021, 9:39 a.m. UTC | #56
> - "External modules must be built with the same GCC version"
> 
>   As has been stated repeatedly, by Linus and others, there's no
>   technical reason behind this claim.  It ignores the realities of how
>   distros release the kernel and compiler independently, with separate
>   cadences.  Minor variances in compiler version are ABI compatible.

Aren't major versions ABI compatible?
Otherwise it is a different architecture.

>   Also, for features which are dependent on compiler version, many of
>   those are now enabled by kbuild.  As I suggested to you previously,
>   kbuild should warn when such features get disabled (which can happen
>   due to a compiler/toolchain change or due to a .config copied from
>   another system).

Anything that is just an optimisation (or pessimisation) really
doesn't matter.
AFAICT most of these options are in the latter category.

One think people might do is download a kernel from kernel.org
and then build an extra module for it.
The compiler they have will be completely different.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
diff mbox series

Patch

diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 952e46876329..7227692fba59 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -51,6 +51,25 @@  export DISABLE_ARM_SSP_PER_TASK_PLUGIN
 GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 # The sancov_plugin.so is included via CFLAGS_KCOV, so it is removed here.
 GCC_PLUGINS_CFLAGS := $(filter-out %/sancov_plugin.so, $(GCC_PLUGINS_CFLAGS))
+
+# Out-of-tree module check: If there's a GCC version mismatch, disable plugins
+# and print a warning.  Otherwise the OOT module build will fail due to
+# plugin_default_version_check().
+ifneq ($(GCC_PLUGINS_CFLAGS),)
+    ifneq ($(KBUILD_EXTMOD),)
+        ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX)))
+
+            ifdef CONFIG_GCC_PLUGIN_RANDSTRUCT
+                $(error error: CONFIG_GCC_PLUGIN_RANDSTRUCT requires out-of-tree modules to be built using the same GCC version as the kernel.)
+            endif
+
+            $(warning warning: Disabling GCC plugins for out-of-tree modules due to GCC version mismatch.)
+            $(warning warning: The following plugins have been disabled: $(gcc-plugin-y))
+            GCC_PLUGINS_CFLAGS :=
+	endif
+    endif
+endif
+
 export GCC_PLUGINS_CFLAGS
 
 # Add the flags to the build!
diff --git a/scripts/Makefile.kcov b/scripts/Makefile.kcov
index 67e8cfe3474b..63a2bc2aabb2 100644
--- a/scripts/Makefile.kcov
+++ b/scripts/Makefile.kcov
@@ -3,4 +3,15 @@  kcov-flags-$(CONFIG_CC_HAS_SANCOV_TRACE_PC)	+= -fsanitize-coverage=trace-pc
 kcov-flags-$(CONFIG_KCOV_ENABLE_COMPARISONS)	+= -fsanitize-coverage=trace-cmp
 kcov-flags-$(CONFIG_GCC_PLUGIN_SANCOV)		+= -fplugin=$(objtree)/scripts/gcc-plugins/sancov_plugin.so
 
+# Out-of-tree module check for GCC version mismatch.
+# See the similar check in scripts/Makefile.gcc-plugins
+ifneq ($(CONFIG_GCC_PLUGIN_SANCOV),)
+    ifneq ($(KBUILD_EXTMOD),)
+        ifneq ($(CONFIG_GCC_VERSION), $(shell $(srctree)/scripts/gcc-version.sh $(HOSTCXX)))
+            $(warning warning: Disabling CONFIG_GCC_PLUGIN_SANCOV for out-of-tree modules due to GCC version mismatch.)
+            kcov-flags-y := $(filter-out %/sancov_plugin.so, $(kcov-flags-y))
+        endif
+    endif
+endif
+
 export CFLAGS_KCOV := $(kcov-flags-y)