diff mbox

[v3,01/11] x86/boot: enumerate documentation for the x86 hardware_subarch

Message ID 1456212255-23959-2-git-send-email-mcgrof@kernel.org (mailing list archive)
State New, archived
Headers show

Commit Message

Luis Chamberlain Feb. 23, 2016, 7:24 a.m. UTC
Although hardware_subarch has been in place since the x86 boot
protocol 2.07 it hasn't been used much. Enumerate current possible
values to avoid misuses and help with semantics later at boot
time should this be used further.

v2: fix typos

Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
---
 arch/x86/include/uapi/asm/bootparam.h | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

Comments

Ingo Molnar Feb. 23, 2016, 8:51 a.m. UTC | #1
* Luis R. Rodriguez <mcgrof@kernel.org> wrote:

> Although hardware_subarch has been in place since the x86 boot
> protocol 2.07 it hasn't been used much. Enumerate current possible
> values to avoid misuses and help with semantics later at boot
> time should this be used further.
> 
> v2: fix typos
> 
> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> ---
>  arch/x86/include/uapi/asm/bootparam.h | 31 ++++++++++++++++++++++++++++++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> index 329254373479..50d5009cf276 100644
> --- a/arch/x86/include/uapi/asm/bootparam.h
> +++ b/arch/x86/include/uapi/asm/bootparam.h
> @@ -157,7 +157,36 @@ struct boot_params {
>  	__u8  _pad9[276];				/* 0xeec */
>  } __attribute__((packed));
>  
> -enum {
> +/**
> + * enum x86_hardware_subarch - x86 hardware subarchitecture
> + *
> + * The x86 hardware_subarch and hardware_subarch_data were added as of the x86
> + * boot protocol 2.07 to help distinguish and supports custom x86 boot
> + * sequences. This enum represents accepted values for the x86
> + * hardware_subarch. Custom x86 boot sequences (not X86_SUBARCH_PC) do not have
> + * or simply do not make use of natural stubs like BIOS or EFI, the
> + * hardware_subarch can be used on the Linux entry path to revector to a
> + * subarchitecture stub when needed. This subarchitecture stub can be used to
> + * set up Linux boot parameters or for special care to account for nonstandard
> + * handling of page tables.
> + *
> + * KVM and Xen HVM do not have a subarch as these are expected to follow
> + * standard x86 boot entries. If there is a genuine need for "hypervisor" type
> + * that should be considered separately in the future.
> + *
> + * @X86_SUBARCH_PC: Should be used if the hardware is enumerable using standard
> + *	PC mechanisms (PCI, ACPI) and doesn't need a special boot flow.
> + * @X86_SUBARCH_LGUEST: Used for x86 hypervisor demo, lguest
> + * @X86_SUBARCH_XEN: Used for Xen guest types which follow the PV boot path,
> + * 	which start at asm startup_xen() entry point and later jump to the C
> + * 	xen_start_kernel() entry point.
> + * @X86_SUBARCH_INTEL_MID: Used for Intel MID (Mobile Internet Device) platform
> + *	systems which do not have the PCI legacy interfaces.
> + * @X86_SUBARCH_CE4100: Used for Intel CE media processor (CE4100) SOC for
> + * 	for settop boxes and media devices, the use of a subarch for CE4100
> + * 	is more of a hack...
> + */
> +enum x86_hardware_subarch {
>  	X86_SUBARCH_PC = 0,
>  	X86_SUBARCH_LGUEST,
>  	X86_SUBARCH_XEN,

No, this is really backwards.

While I agree that we want to get rid of paravirt_enabled(), we _dont_ want to 
spread the use of (arguably broken) boot flags like this! This is one of the cases 
where the cure is worse than the disease.

The 'modern' way to handle platform quirks is to have hardware drivers with 
auto-detection, and drivers know how to detect whether the hardware is present or 
not. For legacy cases where no auto-detection is possible, we have per hardware 
capability flags to turn it off.

The 'hardware subarch' flag of the bootloader can be used to install certain 
quirks, in a single early bootup function - but that should be all: ideally no 
quirks are needed. We don't want to spread 'subarch flags' into various unrelated 
code.

Let's go over your series and see whether and how that principle can be applied:

 - patch #1, #2: should be dropped

 - patch #3, #4: EBDA support.

   The EBDA BIOS signature is an ancient data structure, starting off at
   physical memory 0x40E - which is the very first physical memory page of the
   system.

   We should add an x86_ebda_bios flag that is set to 1 by default, but which
   paravirt bootup can set to 0. That would avoid the reservation of the BIOS 
   area and will save a bit of RAM.

 - patch #5, #6, #7: looks good, does not use a subarch flag

 - patch #8: f00f workaround. Subarch flag use is wrong. The complication with 
   this workaround is that it uses MM tricks to install an IDT. Could you check 
   whether Xen truly needs this quirk? If yes then there should be a new flag, 
   something like x86_idt_readonly, which is set to 0 but Xen can set it to 1. If 
   that flag is set then the F00F workaround does not have to be installed.

   Or something like that: the point is to use a specific flag.

 - Patch #9, #10: RTC support. The problem with RTC platform driver is that it's 
   not possible to detect the RTC reliably - so we sometimes have to quirk it off.

   Instead of using bitflags, add something like x86_platform.rtc_available, which 
   defaults to 1. Don't add negation to the name and don't use bitflags - use a 
   byte flag.

 - Patch #11: this patch wants to disable the PNP BIOS code.

   The complication with the PNP BIOS code is that the PNP BIOS is defined 
   in physical memory, in the 0xf0000-0xffff0 memory range - a 64kB large area. 
   Paravirt images don't want to waste 64kB of RAM just to tell the kernel that no
   PNP BIOS is available.

   But instead of using subarch flags, please define a pnpbios_disable() API call 
   that paravirt bootup can call into. That can disable the PNP BIOS code.

Also, feel free to define a single 'disable PC legacies' quirk function that 
disables all the usual PC legacies that paravirt does not care about or does not 
want to define via RAM. Specific paravirt initialization functions would only have 
to call this one (shared) function.

Thanks,

	Ingo
Luis Chamberlain Feb. 23, 2016, 10:34 a.m. UTC | #2
On Tue, Feb 23, 2016 at 09:51:19AM +0100, Ingo Molnar wrote:
> 
> * Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> 
> > Although hardware_subarch has been in place since the x86 boot
> > protocol 2.07 it hasn't been used much. Enumerate current possible
> > values to avoid misuses and help with semantics later at boot
> > time should this be used further.
> > 
> > v2: fix typos
> > 
> > Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > ---
> >  arch/x86/include/uapi/asm/bootparam.h | 31 ++++++++++++++++++++++++++++++-
> >  1 file changed, 30 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> > index 329254373479..50d5009cf276 100644
> > --- a/arch/x86/include/uapi/asm/bootparam.h
> > +++ b/arch/x86/include/uapi/asm/bootparam.h
> > @@ -157,7 +157,36 @@ struct boot_params {
> >  	__u8  _pad9[276];				/* 0xeec */
> >  } __attribute__((packed));
> >  
> > -enum {
> > +/**
> > + * enum x86_hardware_subarch - x86 hardware subarchitecture
> > + *
> > + * The x86 hardware_subarch and hardware_subarch_data were added as of the x86
> > + * boot protocol 2.07 to help distinguish and supports custom x86 boot
> > + * sequences. This enum represents accepted values for the x86
> > + * hardware_subarch. Custom x86 boot sequences (not X86_SUBARCH_PC) do not have
> > + * or simply do not make use of natural stubs like BIOS or EFI, the
> > + * hardware_subarch can be used on the Linux entry path to revector to a
> > + * subarchitecture stub when needed. This subarchitecture stub can be used to
> > + * set up Linux boot parameters or for special care to account for nonstandard
> > + * handling of page tables.
> > + *
> > + * KVM and Xen HVM do not have a subarch as these are expected to follow
> > + * standard x86 boot entries. If there is a genuine need for "hypervisor" type
> > + * that should be considered separately in the future.
> > + *
> > + * @X86_SUBARCH_PC: Should be used if the hardware is enumerable using standard
> > + *	PC mechanisms (PCI, ACPI) and doesn't need a special boot flow.
> > + * @X86_SUBARCH_LGUEST: Used for x86 hypervisor demo, lguest
> > + * @X86_SUBARCH_XEN: Used for Xen guest types which follow the PV boot path,
> > + * 	which start at asm startup_xen() entry point and later jump to the C
> > + * 	xen_start_kernel() entry point.
> > + * @X86_SUBARCH_INTEL_MID: Used for Intel MID (Mobile Internet Device) platform
> > + *	systems which do not have the PCI legacy interfaces.
> > + * @X86_SUBARCH_CE4100: Used for Intel CE media processor (CE4100) SOC for
> > + * 	for settop boxes and media devices, the use of a subarch for CE4100
> > + * 	is more of a hack...
> > + */
> > +enum x86_hardware_subarch {
> >  	X86_SUBARCH_PC = 0,
> >  	X86_SUBARCH_LGUEST,
> >  	X86_SUBARCH_XEN,
> 
> No, this is really backwards.
> 
> While I agree that we want to get rid of paravirt_enabled(), we _dont_ want to 
> spread the use of (arguably broken) boot flags like this!

I agree that we should not see the spread of boot flags around general x86
code, its not my goal to spread it though, the code that uses it here though is
*early boot code* (although in retrospect the pnpbios use was a fuckup), and I
have some special considerations for early boot code which I think does give
merit to it use. But also keep in mind my goal is to rather fold the boot flag
so its more just an architectural consideration eventually. More on this below.

> This is one of the cases 
> where the cure is worse than the disease.
> 
> The 'modern' way to handle platform quirks is to have hardware drivers with 
> auto-detection, and drivers know how to detect whether the hardware is present or 
> not. For legacy cases where no auto-detection is possible, we have per hardware 
> capability flags to turn it off.
> 
> The 'hardware subarch' flag of the bootloader can be used to install certain 
> quirks, in a single early bootup function - but that should be all: ideally no 
> quirks are needed. We don't want to spread 'subarch flags' into various unrelated 
> code.

There's another possible use for the subarch which I intend on using later for
early boot code to proactively help with disconnect on modifications of the
different x86 entry points. A summary of this is, at times x86 may get some
enhancement / feature which requires some small modifications on early boot
code and if only the bare metal entry point gets vetted on development then
chances are high the xen entry path will be faulty. At times some "feature"
or enhancement may be rather small, and all we need is a respective call
placed on the Xen entry path. Such was the case when the cr4 shadow was
added, since Xen didn't get it Xen crashed. Adding the call on the Xen
entry path fixed the issue later. Other times features may require more
work, such is the case with Kasan. Kasan currently only works on bare
metal and KVM, but not on Xen when PV guests are used. There was an oversight
on review. When I pointed the possible issue with Kasan it was thought that
adding support for Kasan for Xen would be trivial but upon review with the
Kasan maintainer we've realized it may take a bit more work. I've also
confirmed that enabling it on Xen crashes it. There's no fix in sight
and we can't disable Kasan at run time...

What I'm after is a proactive stop-gap that helps us ensure *early boot*
requirements are spelled out so that if some developer is adding a new
feature which mucks with *early boot code* they'll have to consider both entry
points. It will be fine if they only add support for bare metal -- but at least
it will be clearly annotated then. With this in place for instance I would
have expected Kasan would not have been merged unless Xen was also supportedddd
or at the very least Kasan got support to be disabled at run time in early
boot (and then Xen would disable it until it gets support for it). I won't bore
you with more details but if you want to read more about what I mean:

http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html
http://www.do-not-panic.com/2015/12/xen-and-x86-linux-zero-page.html

Now just keep this type of proactive use case in mind for now, its one of
the things I was originally after with the linker tables and x86's use
for it.

> Let's go over your series and see whether and how that principle can be applied:
> 
>  - patch #1, #2: should be dropped
> 
>  - patch #3, #4: EBDA support.
> 
>    The EBDA BIOS signature is an ancient data structure, starting off at
>    physical memory 0x40E - which is the very first physical memory page of the
>    system.
> 
>    We should add an x86_ebda_bios flag that is set to 1 by default, but which
>    paravirt bootup can set to 0. That would avoid the reservation of the BIOS 
>    area and will save a bit of RAM.

Using a flag is certainly one way to go about it. With the subarch approach
I just wanted to point out that my goal was to eventually fold it underneath
the code, we'd just declare the ebda reservation as follows:

x86_init_early_pc_simple(reserve_ebda_region);

See:

http://lkml.kernel.org/r/1455891343-10016-4-git-send-email-mcgrof@kernel.org

Since linker tables are used the ebda call would just be one of the init
calls that x86_init_fn_early_init() would call. The relevant stub for that:

void __init x86_init_fn_early_init(void)                                         
{                                                                               
        struct x86_init_fn *init_fn;                                            
        unsigned int num_inits = LINKTABLE_SIZE(x86_init_fns);                  
                                                                                
        if (!num_inits)                                                         
                return;                                                         
                                                                                
        pr_debug("Number of init entries: %d\n", num_inits);                    
                                                                                
        LINKTABLE_FOR_EACH(init_fn, x86_init_fns) {                             
                if (!x86_init_fn_supports_subarch(init_fn))                     
                        continue;   
		...
		init_fn->early_init();
	}
}

And its helper:

static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn)                
{                                                                               
        if (!fn->supp_hardware_subarch) {                                       
                pr_err("Init sequence fails to declares any supported subarchs: %pF\n", fn->early_init);
                WARN_ON(1);                                                     
        }                                                                       
        if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch)  
                return true;                                                    
        return false;                                                           
} 

struct x86_init_fn would have a __u32 supp_hardware_subarch of supported
subarchs as a bitmask.

What gains would we have to fold such a subarch check on early code?
Well for starters it can provide a proactive means stop-gap measure
for discrepancies between the different entry points to early boot
code. When developers write new early boot code they'd have to
think about which subarchs are supported.

So we can take either approach:

  a) use subarch with the plan to only use it only early x86
     boot code, in this case something like x86_init_early_pc_simple().
  b) add x86_ebda_bios flag and have PV disable it as you note

But I think if we wanted to remain compatible with a proactive strategy
using a) strategy we'd need a flag 'per feature'? I think approach a) scales
a bit more easily in this regard.

Mind you, if you say ebda should not use subarch I understand that it does not
mean other code which I'm adding should not use this linker table arrangement
and subarch, in particular if the subarch was already used (see use of linker
table on MID early boot code [0]), but some guidance if it should not
even be considered for something proactive as I'm trying to achieve as
with kasan (as an example) would be appreciated.

[0] http://lkml.kernel.org/r/1455891343-10016-7-git-send-email-mcgrof@kernel.org

>  - patch #5, #6, #7: looks good, does not use a subarch flag

OK thanks.

>  - patch #8: f00f workaround. Subarch flag use is wrong. The complication with 
>    this workaround is that it uses MM tricks to install an IDT. Could you check 
>    whether Xen truly needs this quirk?

I reviewed this at the Xen developer summit and have been through these
patches: the theory goes that xen and lguest do not need the f00f workaround
(so as per your language needs the quirk) as the hypervisor should be the one
doing the work around. Konrad also mentioned though that he actually thought
that perhaps the work around alone could be removed from the kernel all
together already -- but he was unable to provide a direct confirmation of that.

So yes we need it.

>    If yes then there should be a new flag, 
>    something like x86_idt_readonly, which is set to 0 but Xen can set it to 1. If 
>    that flag is set then the F00F workaround does not have to be installed.
> 
>    Or something like that: the point is to use a specific flag.

OK.

>  - Patch #9, #10: RTC support. The problem with RTC platform driver is that it's 
>    not possible to detect the RTC reliably - so we sometimes have to quirk it off.
> 
>    Instead of using bitflags, add something like x86_platform.rtc_available, which 
>    defaults to 1. Don't add negation to the name and don't use bitflags - use a 
>    byte flag.

OK.

>  - Patch #11: this patch wants to disable the PNP BIOS code.
> 
>    The complication with the PNP BIOS code is that the PNP BIOS is defined 
>    in physical memory, in the 0xf0000-0xffff0 memory range - a 64kB large area. 
>    Paravirt images don't want to waste 64kB of RAM just to tell the kernel that no
>    PNP BIOS is available.
> 
>    But instead of using subarch flags, please define a pnpbios_disable() API call 
>    that paravirt bootup can call into. That can disable the PNP BIOS code.

OK, sure, much better thanks.

> Also, feel free to define a single 'disable PC legacies' quirk function that 
> disables all the usual PC legacies that paravirt does not care about or does not 
> want to define via RAM. Specific paravirt initialization functions would only have 
> to call this one (shared) function.

Sure.

  Luis
Luis Chamberlain Feb. 23, 2016, 8:41 p.m. UTC | #3
On Tue, Feb 23, 2016 at 11:34:09AM +0100, Luis R. Rodriguez wrote:
> On Tue, Feb 23, 2016 at 09:51:19AM +0100, Ingo Molnar wrote:
> > 
> > * Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> > 
> > > Although hardware_subarch has been in place since the x86 boot
> > > protocol 2.07 it hasn't been used much. Enumerate current possible
> > > values to avoid misuses and help with semantics later at boot
> > > time should this be used further.
> > > 
> > > v2: fix typos
> > > 
> > > Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
> > > ---
> > >  arch/x86/include/uapi/asm/bootparam.h | 31 ++++++++++++++++++++++++++++++-
> > >  1 file changed, 30 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
> > > index 329254373479..50d5009cf276 100644
> > > --- a/arch/x86/include/uapi/asm/bootparam.h
> > > +++ b/arch/x86/include/uapi/asm/bootparam.h
> > > @@ -157,7 +157,36 @@ struct boot_params {
> > >  	__u8  _pad9[276];				/* 0xeec */
> > >  } __attribute__((packed));
> > >  
> > > -enum {
> > > +/**
> > > + * enum x86_hardware_subarch - x86 hardware subarchitecture
> > > + *
> > > + * The x86 hardware_subarch and hardware_subarch_data were added as of the x86
> > > + * boot protocol 2.07 to help distinguish and supports custom x86 boot
> > > + * sequences. This enum represents accepted values for the x86
> > > + * hardware_subarch. Custom x86 boot sequences (not X86_SUBARCH_PC) do not have
> > > + * or simply do not make use of natural stubs like BIOS or EFI, the
> > > + * hardware_subarch can be used on the Linux entry path to revector to a
> > > + * subarchitecture stub when needed. This subarchitecture stub can be used to
> > > + * set up Linux boot parameters or for special care to account for nonstandard
> > > + * handling of page tables.
> > > + *
> > > + * KVM and Xen HVM do not have a subarch as these are expected to follow
> > > + * standard x86 boot entries. If there is a genuine need for "hypervisor" type
> > > + * that should be considered separately in the future.
> > > + *
> > > + * @X86_SUBARCH_PC: Should be used if the hardware is enumerable using standard
> > > + *	PC mechanisms (PCI, ACPI) and doesn't need a special boot flow.
> > > + * @X86_SUBARCH_LGUEST: Used for x86 hypervisor demo, lguest
> > > + * @X86_SUBARCH_XEN: Used for Xen guest types which follow the PV boot path,
> > > + * 	which start at asm startup_xen() entry point and later jump to the C
> > > + * 	xen_start_kernel() entry point.
> > > + * @X86_SUBARCH_INTEL_MID: Used for Intel MID (Mobile Internet Device) platform
> > > + *	systems which do not have the PCI legacy interfaces.
> > > + * @X86_SUBARCH_CE4100: Used for Intel CE media processor (CE4100) SOC for
> > > + * 	for settop boxes and media devices, the use of a subarch for CE4100
> > > + * 	is more of a hack...
> > > + */
> > > +enum x86_hardware_subarch {
> > >  	X86_SUBARCH_PC = 0,
> > >  	X86_SUBARCH_LGUEST,
> > >  	X86_SUBARCH_XEN,
> > 
> > No, this is really backwards.
> > 
> > While I agree that we want to get rid of paravirt_enabled(), we _dont_ want to 
> > spread the use of (arguably broken) boot flags like this!
> 
> I agree that we should not see the spread of boot flags around general x86
> code, its not my goal to spread it though, the code that uses it here though is
> *early boot code* (although in retrospect the pnpbios use was a fuckup), and I
> have some special considerations for early boot code which I think does give
> merit to it use. But also keep in mind my goal is to rather fold the boot flag
> so its more just an architectural consideration eventually. More on this below.

It seems I TL;DR suck; all this is a long winded way of asking, can we keep the
subarch just for EBDA and use the flags for the other things as you noted?

  Luis
Ingo Molnar Feb. 24, 2016, 8:32 a.m. UTC | #4
* Luis R. Rodriguez <mcgrof@kernel.org> wrote:

> > > > +enum x86_hardware_subarch {
> > > >  	X86_SUBARCH_PC = 0,
> > > >  	X86_SUBARCH_LGUEST,
> > > >  	X86_SUBARCH_XEN,
> > > 
> > > No, this is really backwards.
> > > 
> > > While I agree that we want to get rid of paravirt_enabled(), we _dont_ want to 
> > > spread the use of (arguably broken) boot flags like this!
> > 
> > I agree that we should not see the spread of boot flags around general x86 
> > code, its not my goal to spread it though, the code that uses it here though 
> > is *early boot code* (although in retrospect the pnpbios use was a fuckup), 
> > and I have some special considerations for early boot code which I think does 
> > give merit to it use. But also keep in mind my goal is to rather fold the boot 
> > flag so its more just an architectural consideration eventually. More on this 
> > below.
> 
> It seems I TL;DR suck; all this is a long winded way of asking, can we keep the 
> subarch just for EBDA and use the flags for the other things as you noted?

No, even for EBDA we should make the flag EBDA specific, because that really tells 
us what it's about.

The EBDA code could not care less about what subarch it's running on - it only 
cares about whether it's safe to search for the EBDA signature in the hardware's 
low RAM.

So instead of this:

@@ -38,7 +38,7 @@ void __init reserve_ebda_region(void)
         * that the paravirt case can handle memory setup
         * correctly, without our help.
         */
-       if (paravirt_enabled())
+       if (boot_params.hdr.hardware_subarch != X86_SUBARCH_PC)
                return;

I'd suggest adding an EBDA search flag, something like this:

	if (!x86_platform.legacy.ebda_search)
		return;

Note that the 'legacy' intermediate structure can be used to group various 
legacies, while making it clear that this is not something modern hardware should 
care about.

The x86_platform.legacy.ebda_search flag can then be set up during (very) early 
bootup:

	if (boot_params.hdr.hardware_subarch == X86_SUBARCH_PC)
		x86_platform.legacy.ebda_search = 1;

This might look like an unnecessary indirection but it isn't: beyond the 
documentation value it also makes things a lot clearer should we introduce other 
legacy flags in x86_platform.legacy.

For hard coded platform quirks I'd suggest we add x86_platform.quirks flags. For 
example the F00F hack for Xen could be done via:

	x86_platform.quirks.idt_remap = 0;

and would be set like this during early init:

void early_init_platform_quirks(void)
{
	x86_platform.legacy.ebda_search = 0;
	x86_platform.quirks.idt_remap = 1;

	switch (boot_params.hdr.hardware_subarch) {
		case X86_SUBARCH_PC:
			x86_platform.legacy.ebda_search = 1;
			break;
		case X86_SUBARCH_XEN:
			x86_platform.quirks.idt_remap = 0;
			break;
		case X86_SUBARCH_LGUEST:
			x86_platform.quirks.idt_remap = 0;
			break;
	}
}

And if also add the legacy RTC flag, it becomes:

void early_init_hardcoded_platform_quirks(void)
{
	x86_platform.legacy.ebda_search = 0;
	x86_platform.quirks.idt_remap = 1;
	x86_platform.legacy.rtc = 1;

	switch (boot_params.hdr.hardware_subarch) {
		case X86_SUBARCH_PC:
			x86_platform.legacy.ebda_search = 1;
			break;
		case X86_SUBARCH_XEN:
			x86_platform.quirks.idt_remap = 0;
			x86_platform.legacy.rtc = 0;
			break;
		case X86_SUBARCH_LGUEST:
			x86_platform.quirks.idt_remap = 0;
			x86_platform.legacy.rtc = 0;
			break;
	}
}

Note that both opt-in and opt-out quirks/legacies are possible this way, and note 
how cleanly and consistently it's all organized: setup of all hard coded 
legacies/quirks is concentrated in a single function, and the actual usage sites 
don't know anything about subarchitectures.

Ok? Functionally this approach is equivalent to your series, but it's cleaner, and 
it's also easier to maintain in the long run: there's only a single place to look 
to check all hard coded legacies/quirks - instead of the code being spread out all 
around the x86 code.

Furthermore we should probably move a few other existing legacies to this flag 
space as well, to make all this more consistent.

Thanks,

	Ingo
Andy Lutomirski Feb. 24, 2016, 4:40 p.m. UTC | #5
On Feb 24, 2016 12:33 AM, "Ingo Molnar" <mingo@kernel.org> wrote:
>
> For hard coded platform quirks I'd suggest we add x86_platform.quirks flags. For
> example the F00F hack for Xen could be done via:
>
>         x86_platform.quirks.idt_remap = 0;
>

Don't we unconditionally remap the IDT?  I think Kees did it for
general purpose hardening due to our complete inability to hide the
IDT address. I.e. I think we can remove the f00f condition entirely.

Other than that, I agree with you.
Luis Chamberlain Feb. 25, 2016, 1:18 a.m. UTC | #6
On Feb 24, 2016 8:40 AM, "Andy Lutomirski" <luto@amacapital.net> wrote:
>
> On Feb 24, 2016 12:33 AM, "Ingo Molnar" <mingo@kernel.org> wrote:
> >
> > For hard coded platform quirks I'd suggest we add x86_platform.quirks
flags. For
> > example the F00F hack for Xen could be done via:
> >
> >         x86_platform.quirks.idt_remap = 0;
> >
>
> Don't we unconditionally remap the IDT?  I think Kees did it for
> general purpose hardening due to our complete inability to hide the
> IDT address. I.e. I think we can remove the f00f condition entirely.
>

Kees can you confirm ?

  Luis
Andy Lutomirski Feb. 25, 2016, 1:29 a.m. UTC | #7
On Wed, Feb 24, 2016 at 5:18 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
>
> On Feb 24, 2016 8:40 AM, "Andy Lutomirski" <luto@amacapital.net> wrote:
>>
>> On Feb 24, 2016 12:33 AM, "Ingo Molnar" <mingo@kernel.org> wrote:
>> >
>> > For hard coded platform quirks I'd suggest we add x86_platform.quirks
>> > flags. For
>> > example the F00F hack for Xen could be done via:
>> >
>> >         x86_platform.quirks.idt_remap = 0;
>> >
>>
>> Don't we unconditionally remap the IDT?  I think Kees did it for
>> general purpose hardening due to our complete inability to hide the
>> IDT address. I.e. I think we can remove the f00f condition entirely.
>>
>
> Kees can you confirm ?
>

No need.

        /*
         * Set the IDT descriptor to a fixed read-only location, so that the
         * "sidt" instruction will not leak the location of the kernel, and
         * to defend the IDT against arbitrary memory write vulnerabilities.
         * It will be reloaded in cpu_init() */
        __set_fixmap(FIX_RO_IDT, __pa_symbol(idt_table), PAGE_KERNEL_RO);
        idt_descr.address = fix_to_virt(FIX_RO_IDT);

IIUI this works around f00f as a side effect.  The only other thing
needed is the code that X86_BUG_F00F guards, which is responsible for
fixing up the error generated on attempted F00F exploitation from an
OOPS to a SIGILL.  I see no reason why that code couldn't be allowed
to run on even a PV guest on a F00F-affected CPU -- it would never
trigger anyway.
Ingo Molnar Feb. 25, 2016, 8:10 a.m. UTC | #8
* Andy Lutomirski <luto@amacapital.net> wrote:

> On Feb 24, 2016 12:33 AM, "Ingo Molnar" <mingo@kernel.org> wrote:
> >
> > For hard coded platform quirks I'd suggest we add x86_platform.quirks flags. For
> > example the F00F hack for Xen could be done via:
> >
> >         x86_platform.quirks.idt_remap = 0;
> >
> 
> Don't we unconditionally remap the IDT?  I think Kees did it for
> general purpose hardening due to our complete inability to hide the
> IDT address. I.e. I think we can remove the f00f condition entirely.

Yeah, indeed - I only judged by the (limited) patch context and assumed the Xen 
problem was with IDT remapping.

But what the quirk really does is only to avoid printing the f00f workaround - 
i.e. a cosmetic change. I think we should just drop the paravirt_enabled() check.

Thanks,

	Ingo
Luis Chamberlain March 2, 2016, 12:43 a.m. UTC | #9
On Wed, Feb 24, 2016 at 09:32:59AM +0100, Ingo Molnar wrote:
> And if also add the legacy RTC flag, it becomes:
> 
> void early_init_hardcoded_platform_quirks(void)
> {
> 	x86_platform.legacy.ebda_search = 0;
> 	x86_platform.quirks.idt_remap = 1;
> 	x86_platform.legacy.rtc = 1;
> 
> 	switch (boot_params.hdr.hardware_subarch) {
> 		case X86_SUBARCH_PC:
> 			x86_platform.legacy.ebda_search = 1;
> 			break;
> 		case X86_SUBARCH_XEN:
> 			x86_platform.quirks.idt_remap = 0;
> 			x86_platform.legacy.rtc = 0;
> 			break;
> 		case X86_SUBARCH_LGUEST:
> 			x86_platform.quirks.idt_remap = 0;
> 			x86_platform.legacy.rtc = 0;
> 			break;
> 	}
> }
> 
> Note that both opt-in and opt-out quirks/legacies are possible this way, and note 
> how cleanly and consistently it's all organized: setup of all hard coded 
> legacies/quirks is concentrated in a single function, and the actual usage sites 
> don't know anything about subarchitectures.

I've tried this and so far so good, and I agree and I like how the quirks
are bundled in one place.

> Ok? Functionally this approach is equivalent to your series, but it's cleaner, and 
> it's also easier to maintain in the long run: there's only a single place to look 
> to check all hard coded legacies/quirks - instead of the code being spread out all 
> around the x86 code.

Sure, but I'll note I was not intending on spreading use of subarch further,
the use of subarch in pnpbios was certainly an overlooked mistake on my part.

There's only one problem with this strategy I can think so far which differs
from my original approach, which is partly why I actually started looking at
this stuff:

  it doesn't help us pro-actively vet each early boot sequence
  thrown at the x86 path well work on all required subarchs

The scope of addressing requirements I'm trying to address are things stuffed
into x86 init path or the kernel's init path from the first entry point down
to perhaps arch specific setup_arch() calls. Now granted we don't get
modifications in this area a lot, but when we do, if folks did not consider
our different requirements (supporting each subarch/entry path) chances are
high we'll crash or have a feature not fixed / not considered a subarch.

Case in point, kasan. Its still busted on Xen and no one seems to care.
Too late. How do we proactively prevent such things ? Granted peer review
should have caught this, but why not also do something proactively ?

The quirks stuff / proactive solution should perhaps be considered orthogonal.
It just so happened that I was able to address some quirks with what I was
doing, and obviously there's a better way for those. The use of
paravirt_enabled() for quirks also happened to reveal the lack of clarity
on semantics between paravirtualized environments / non-PV hypervisors and
late boot PV/hypervisor checks.

If you can think of another proactive strategy let me know.

> Furthermore we should probably move a few other existing legacies to this flag 
> space as well, to make all this more consistent.

Indeed.

  Luis
Luis Chamberlain March 2, 2016, 7:40 p.m. UTC | #10
On Wed, Mar 02, 2016 at 01:43:42AM +0100, Luis R. Rodriguez wrote:
> On Wed, Feb 24, 2016 at 09:32:59AM +0100, Ingo Molnar wrote:
> There's only one problem with this strategy I can think so far which differs
> from my original approach, which is partly why I actually started looking at
> this stuff:
> 
>   it doesn't help us pro-actively vet each early boot sequence
>   thrown at the x86 path well work on all required subarchs
> 
> The quirks stuff / proactive solution should perhaps be considered orthogonal.
> It just so happened that I was able to address some quirks with what I was

Since it is orthogonal I'll simply work off on the paravirt_enabled() removal
separately as its possible. Since the clarity on semantics will be needed
for other work I'm doing (proactive solution to avoid issues on early boot)
and since the proposed alternative still uses subarch for the quirks as
you recommended I'll at least still push for documentation update on subarch
use as well for now.

After this, and then after sort a simple link table upstream I can then start
focusing more on the proactive solution once again. That should help keep
things separate and make it clearer what I'm trying to achieve later.

  Luis
Luis Chamberlain April 7, 2016, 8:59 p.m. UTC | #11
David, please note below the highlighted code.

On Wed, Feb 24, 2016 at 09:32:59AM +0100, Ingo Molnar wrote:
> 
> * Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> For hard coded platform quirks I'd suggest we add x86_platform.quirks flags. For 
> example the F00F hack for Xen could be done via:
> 
> 	x86_platform.quirks.idt_remap = 0;
> 
> and would be set like this during early init:
> 
> void early_init_platform_quirks(void)
> {
> 	x86_platform.legacy.ebda_search = 0;
> 	x86_platform.quirks.idt_remap = 1;
> 
> 	switch (boot_params.hdr.hardware_subarch) {
> 		case X86_SUBARCH_PC:
> 			x86_platform.legacy.ebda_search = 1;
> 			break;
> 		case X86_SUBARCH_XEN:
> 			x86_platform.quirks.idt_remap = 0;
> 			break;
> 		case X86_SUBARCH_LGUEST:
> 			x86_platform.quirks.idt_remap = 0;
> 			break;
> 	}
> }
> 
> And if also add the legacy RTC flag, it becomes:
> 
> void early_init_hardcoded_platform_quirks(void)
> {
> 	x86_platform.legacy.ebda_search = 0;
> 	x86_platform.quirks.idt_remap = 1;
> 	x86_platform.legacy.rtc = 1;
> 
> 	switch (boot_params.hdr.hardware_subarch) {
> 		case X86_SUBARCH_PC:
> 			x86_platform.legacy.ebda_search = 1;
> 			break;
> 		case X86_SUBARCH_XEN:
> 			x86_platform.quirks.idt_remap = 0;
> 			x86_platform.legacy.rtc = 0;
> 			break;
> 		case X86_SUBARCH_LGUEST:
> 			x86_platform.quirks.idt_remap = 0;
> 			x86_platform.legacy.rtc = 0;
> 			break;
> 	}
> }
> 
> Note that both opt-in and opt-out quirks/legacies are possible this way, and note 
> how cleanly and consistently it's all organized: setup of all hard coded 
> legacies/quirks is concentrated in a single function, and the actual usage sites 
> don't know anything about subarchitectures.

<-- snip -- > 

So.. I went with Ingo's template.

> Furthermore we should probably move a few other existing legacies to this flag 
> space as well, to make all this more consistent.

And this suggestion should explain a bit of the effort I put into making
room for other legacy things, which I'll elaborate in the other thread.

  Luis
diff mbox

Patch

diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index 329254373479..50d5009cf276 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -157,7 +157,36 @@  struct boot_params {
 	__u8  _pad9[276];				/* 0xeec */
 } __attribute__((packed));
 
-enum {
+/**
+ * enum x86_hardware_subarch - x86 hardware subarchitecture
+ *
+ * The x86 hardware_subarch and hardware_subarch_data were added as of the x86
+ * boot protocol 2.07 to help distinguish and supports custom x86 boot
+ * sequences. This enum represents accepted values for the x86
+ * hardware_subarch. Custom x86 boot sequences (not X86_SUBARCH_PC) do not have
+ * or simply do not make use of natural stubs like BIOS or EFI, the
+ * hardware_subarch can be used on the Linux entry path to revector to a
+ * subarchitecture stub when needed. This subarchitecture stub can be used to
+ * set up Linux boot parameters or for special care to account for nonstandard
+ * handling of page tables.
+ *
+ * KVM and Xen HVM do not have a subarch as these are expected to follow
+ * standard x86 boot entries. If there is a genuine need for "hypervisor" type
+ * that should be considered separately in the future.
+ *
+ * @X86_SUBARCH_PC: Should be used if the hardware is enumerable using standard
+ *	PC mechanisms (PCI, ACPI) and doesn't need a special boot flow.
+ * @X86_SUBARCH_LGUEST: Used for x86 hypervisor demo, lguest
+ * @X86_SUBARCH_XEN: Used for Xen guest types which follow the PV boot path,
+ * 	which start at asm startup_xen() entry point and later jump to the C
+ * 	xen_start_kernel() entry point.
+ * @X86_SUBARCH_INTEL_MID: Used for Intel MID (Mobile Internet Device) platform
+ *	systems which do not have the PCI legacy interfaces.
+ * @X86_SUBARCH_CE4100: Used for Intel CE media processor (CE4100) SOC for
+ * 	for settop boxes and media devices, the use of a subarch for CE4100
+ * 	is more of a hack...
+ */
+enum x86_hardware_subarch {
 	X86_SUBARCH_PC = 0,
 	X86_SUBARCH_LGUEST,
 	X86_SUBARCH_XEN,