Message ID | 20160120210014.GF4769@char.us.oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Jan 20, 2016 at 1:00 PM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>> +static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn)
>> +{
>> +	if (!fn->supp_hardware_subarch) {
>> +		pr_err("Init sequence fails to declares any supported subarchs: %pF\n", fn->early_init);
>> +		WARN_ON(1);
>> +	}
>> +	if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch)
>> +		return true;
>> +	return false;
>> +}
>
> So the logic for this working is that boot_params.hdr.hardware_subarch
>
> And the macros define two: BIT(X86_SUBARCH_PC) or BIT(X86_SUBARCH_XEN).
>
> But hardware_subarch by default is set to zero. Which means if GRUB2, PXELinux, Xen multiboot1
> don't set it - then the X86_SUBARCH_PC is choosen right?
>
> 1 << 0 & 1 << X86_SUBARCH_PC (which is zero).
>
> For this to nicely work with Xen it ought to do this:
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 993b7a7..6cf9afd 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -1676,6 +1676,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
>  	boot_params.hdr.ramdisk_image = initrd_start;
>  	boot_params.hdr.ramdisk_size = xen_start_info->mod_len;
>  	boot_params.hdr.cmd_line_ptr = __pa(xen_start_info->cmd_line);
> +	boot_params.hdr.hardware_subarch = X86_SUBARCH_XEN;
>
>  	if (!xen_initial_domain()) {
>  		add_preferred_console("xenboot", 0, NULL);
>
> ?

That's correct for PV and PVH; likewise, when qemu is required for HVM, qemu could set it. I have the qemu change done, but that should only cover HVM. A common place to set this as well could be the hypervisor, but currently the hypervisor doesn't set any boot_params; instead a generic struct is passed and the kernel code (for any OS) is expected to interpret this and then set the required values for the OS in the init path. Long term, though, if we wanted to merge init further, one way could be to have the hypervisor just set the zero page cleanly for the different modes.
If we needed more data other than the hardware_subarch we also have hardware_subarch_data; that's a u64, and how it is used would be up to the subarch. In Xen's case it could do what it wants with it. That would still mean perhaps defining, as part of a Xen boot protocol, a place where Xen-specific code can count on finding more Xen data passed by the hypervisor, the xen_start_info. That is, if we wanted to merge init paths this is something to consider.

One thing I considered, on the question of who should set the zero page for Xen with the prospect of merging inits (or at least this subarch), for both the short term and the long term, are the obvious implications in terms of hypervisor / kernel / qemu combination requirements if the subarch is needed. Having it set in the kernel is an obvious immediate choice for PV / PVH, but it means we can't merge init paths completely (down to asm inits); we'd still be able to merge some C init paths, though the first entry would still be different. Having the zero page set in the hypervisor would go a long way, but it would mean a hypervisor change is required.

These prospects are worth discussing, especially in light of Boris's hvmlite work.

  Luis
On 01/20/16 13:33, Luis R. Rodriguez wrote: > > That's correct for PV and PVH, likewise when qemu is required for HVM > qemu could set it. I have the qemu change done but that should only > cover HVM. A common place to set this as well could be the hypervisor, > but currently the hypervisor doesn't set any boot_params, instead a > generic struct is passed and the kernel code (for any OS) is expected > to interpret this and then set the required values for the OS in the > init path. Long term though if we wanted to merge init further one way > could be to have the hypervisor just set the zero page cleanly for the > different modes. If we needed more data other than the > hardware_subarch we also have the hardware_subarch_data, that's a u64 > , and how that is used would be up to the subarch. In Xen's case it > could do what it wants with it. That would still mean perhaps defining > as part of a Xen boot protocol a place where xen specific code can > count on finding more Xen data passed by the hypervisor, the > xen_start_info. That is, if we wanted to merge init paths this is > something to consider. > > One thing I considered on the question of who should set the zero page > for Xen with the prospect of merging inits, or at least this subarch > for both short term and long term are the obvious implications in > terms of hypervisor / kernel / qemu combination requirements if the > subarch is needed. Having it set in the kernel is an obvious immediate > choice for PV / PVH but it means we can't merge init paths completely > (down to asm inits), we'd still be able to merge some C init paths > though, the first entry would still be different. Having the zero page > set on the hypervisor would go long ways but it would mean a > hypervisor change required. > > These prospects are worth discussing, specially in light of Boris's > hvmlite work. > The above doesn't make sense to me. hardware_subarch is really used when the boot sequence is somehow nonstandard. 
HVM probably doesn't need that. -hpa
On Wed, Jan 20, 2016 at 1:41 PM, H. Peter Anvin <hpa@zytor.com> wrote: > On 01/20/16 13:33, Luis R. Rodriguez wrote: >> >> That's correct for PV and PVH, likewise when qemu is required for HVM >> qemu could set it. I have the qemu change done but that should only >> cover HVM. A common place to set this as well could be the hypervisor, >> but currently the hypervisor doesn't set any boot_params, instead a >> generic struct is passed and the kernel code (for any OS) is expected >> to interpret this and then set the required values for the OS in the >> init path. Long term though if we wanted to merge init further one way >> could be to have the hypervisor just set the zero page cleanly for the >> different modes. If we needed more data other than the >> hardware_subarch we also have the hardware_subarch_data, that's a u64 >> , and how that is used would be up to the subarch. In Xen's case it >> could do what it wants with it. That would still mean perhaps defining >> as part of a Xen boot protocol a place where xen specific code can >> count on finding more Xen data passed by the hypervisor, the >> xen_start_info. That is, if we wanted to merge init paths this is >> something to consider. >> >> One thing I considered on the question of who should set the zero page >> for Xen with the prospect of merging inits, or at least this subarch >> for both short term and long term are the obvious implications in >> terms of hypervisor / kernel / qemu combination requirements if the >> subarch is needed. Having it set in the kernel is an obvious immediate >> choice for PV / PVH but it means we can't merge init paths completely >> (down to asm inits), we'd still be able to merge some C init paths >> though, the first entry would still be different. Having the zero page >> set on the hypervisor would go long ways but it would mean a >> hypervisor change required. >> >> These prospects are worth discussing, specially in light of Boris's >> hvmlite work. 
>>
>
> The above doesn't make sense to me.  hardware_subarch is really used
> when the boot sequence is somehow nonstandard.

Thanks for the feedback -- as it stands today, hardware_subarch is only used by lguest, Moorestown, and CE4100, even though we had definitions for it for Xen -- those are not used yet. Its documentation does make references to differences for a paravirtualized environment, and uses a few examples, but doesn't go into great depth about restrictions, so its limitations in how we could use it were not clear to me.

> HVM probably doesn't need that.

Today HVM doesn't need it, but perhaps that is because it has not needed changes early in boot. Will it, or could it? I'd even invite us to consider the same for other hypervisors or PV hypervisors. I'll note that things like cpu_has_hypervisor() or derivatives (kvm_para_available(), which is now used in drivers even, see sound/pci/intel8x0.c) require init_hypervisor_platform() to have run; in terms of the x86 init sequence this runs pretty late, at setup_arch(). Should code need to know hypervisor info any time before that, it has no generic option available.

I'm fine if we want to restrict hardware_subarch, but I'll note the semantics were not explicit enough to delineate clear differences, and I just wanted to highlight the current early boot restriction of cpu_has_hypervisor().

  Luis
On January 20, 2016 2:12:49 PM PST, "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: >On Wed, Jan 20, 2016 at 1:41 PM, H. Peter Anvin <hpa@zytor.com> wrote: >> On 01/20/16 13:33, Luis R. Rodriguez wrote: >>> >>> That's correct for PV and PVH, likewise when qemu is required for >HVM >>> qemu could set it. I have the qemu change done but that should only >>> cover HVM. A common place to set this as well could be the >hypervisor, >>> but currently the hypervisor doesn't set any boot_params, instead a >>> generic struct is passed and the kernel code (for any OS) is >expected >>> to interpret this and then set the required values for the OS in the >>> init path. Long term though if we wanted to merge init further one >way >>> could be to have the hypervisor just set the zero page cleanly for >the >>> different modes. If we needed more data other than the >>> hardware_subarch we also have the hardware_subarch_data, that's a >u64 >>> , and how that is used would be up to the subarch. In Xen's case it >>> could do what it wants with it. That would still mean perhaps >defining >>> as part of a Xen boot protocol a place where xen specific code can >>> count on finding more Xen data passed by the hypervisor, the >>> xen_start_info. That is, if we wanted to merge init paths this is >>> something to consider. >>> >>> One thing I considered on the question of who should set the zero >page >>> for Xen with the prospect of merging inits, or at least this subarch >>> for both short term and long term are the obvious implications in >>> terms of hypervisor / kernel / qemu combination requirements if the >>> subarch is needed. Having it set in the kernel is an obvious >immediate >>> choice for PV / PVH but it means we can't merge init paths >completely >>> (down to asm inits), we'd still be able to merge some C init paths >>> though, the first entry would still be different. 
Having the zero >page >>> set on the hypervisor would go long ways but it would mean a >>> hypervisor change required. >>> >>> These prospects are worth discussing, specially in light of Boris's >>> hvmlite work. >>> >> >> The above doesn't make sense to me. hardware_subarch is really used >> when the boot sequence is somehow nonstandard. > >Thanks for the feedback -- as it stands today hardware_subarch is only >used by lguest, Moorestown, and CE4100 even though we had definitions >for it for Xen -- this is not used yet. Its documentation does make >references to differences for a paravirtualized environment, and uses >a few examples but doesn't go into great depths about restrictions so >its limitations in how we could use it were not clear to me. > >> HVM probably doesn't need that. > >Today HVM doesn't need it, but perhaps that is because it has not >needed changes early on boot. Will it, or could it? I'd even invite us >to consider the same for other hypervisors or PV hypervisors. I'll >note that things like cpu_has_hypervisor() or derivatives >(kvm_para_available() which is now used on drivers even, see >sound/pci/intel8x0.c) requires init_hypervisor_platform() run, in >terms of the x86 init sequence this is run pretty late at >setup_arch(). Should code need to know hypervisor info anytime before >that they have no generic option available. > >I'm fine if we want to restrict hardware_subarch but I'll note the >semantics were not that explicit to delineate clear differences and I >just wanted to highlight the current early boot restriction of >cpu_has_hypervisor(). > > Luis Basically, if the hardware is enumerable using standard PC mechanisms (PCI, ACPI) and doesn't need a special boot flow it should use type 0.
On 20/01/16 at 22:33, Luis R. Rodriguez wrote:
> On Wed, Jan 20, 2016 at 1:00 PM, Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com> wrote:
>>> +static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn)
>>> +{
>>> +	if (!fn->supp_hardware_subarch) {
>>> +		pr_err("Init sequence fails to declares any supported subarchs: %pF\n", fn->early_init);
>>> +		WARN_ON(1);
>>> +	}
>>> +	if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch)
>>> +		return true;
>>> +	return false;
>>> +}
>>
>> So the logic for this working is that boot_params.hdr.hardware_subarch
>>
>> And the macros define two: BIT(X86_SUBARCH_PC) or BIT(X86_SUBARCH_XEN).
>>
>> But hardware_subarch by default is set to zero. Which means if GRUB2, PXELinux, Xen multiboot1
>> don't set it - then the X86_SUBARCH_PC is choosen right?
>>
>> 1 << 0 & 1 << X86_SUBARCH_PC (which is zero).
>>
>> For this to nicely work with Xen it ought to do this:
>>
>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
>> index 993b7a7..6cf9afd 100644
>> --- a/arch/x86/xen/enlighten.c
>> +++ b/arch/x86/xen/enlighten.c
>> @@ -1676,6 +1676,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
>>  	boot_params.hdr.ramdisk_image = initrd_start;
>>  	boot_params.hdr.ramdisk_size = xen_start_info->mod_len;
>>  	boot_params.hdr.cmd_line_ptr = __pa(xen_start_info->cmd_line);
>> +	boot_params.hdr.hardware_subarch = X86_SUBARCH_XEN;
>>
>>  	if (!xen_initial_domain()) {
>>  		add_preferred_console("xenboot", 0, NULL);
>>
>>
>> ?
>
> That's correct for PV and PVH, likewise when qemu is required for HVM
> qemu could set it. I have the qemu change done but that should only
> cover HVM. A common place to set this as well could be the hypervisor,
> but currently the hypervisor doesn't set any boot_params, instead a
> generic struct is passed and the kernel code (for any OS) is expected
> to interpret this and then set the required values for the OS in the
> init path.
> Long term though if we wanted to merge init further one way
> could be to have the hypervisor just set the zero page cleanly for the
> different modes. If we needed more data other than the
> hardware_subarch we also have the hardware_subarch_data, that's a u64,
> and how that is used would be up to the subarch. In Xen's case it
> could do what it wants with it. That would still mean perhaps defining
> as part of a Xen boot protocol a place where xen specific code can
> count on finding more Xen data passed by the hypervisor, the
> xen_start_info. That is, if we wanted to merge init paths this is
> something to consider.
>
> One thing I considered on the question of who should set the zero page
> for Xen with the prospect of merging inits, or at least this subarch
> for both short term and long term are the obvious implications in
> terms of hypervisor / kernel / qemu combination requirements if the
> subarch is needed. Having it set in the kernel is an obvious immediate
> choice for PV / PVH but it means we can't merge init paths completely
> (down to asm inits), we'd still be able to merge some C init paths
> though, the first entry would still be different. Having the zero page
> set on the hypervisor would go long ways but it would mean a
> hypervisor change required.

I don't think the hypervisor should be setting Linux-specific boot-related parameters; the boot ABI should be OS-agnostic. IMHO, a small shim should be added to Linux in order to set what Linux requires when entering from a Xen entry point.

Roger.
On 01/21/2016 03:38 AM, Roger Pau Monné wrote: > El 20/01/16 a les 22.33, Luis R. Rodriguez ha escrit: >> On Wed, Jan 20, 2016 at 1:00 PM, Konrad Rzeszutek Wilk >> <konrad.wilk@oracle.com> wrote: >>>> +static bool x86_init_fn_supports_subarch(struct x86_init_fn *fn) >>>> +{ >>>> + if (!fn->supp_hardware_subarch) { >>>> + pr_err("Init sequence fails to declares any supported subarchs: %pF\n", fn->early_init); >>>> + WARN_ON(1); >>>> + } >>>> + if (BIT(boot_params.hdr.hardware_subarch) & fn->supp_hardware_subarch) >>>> + return true; >>>> + return false; >>>> +} >>> So the logic for this working is that boot_params.hdr.hardware_subarch >>> >>> And the macros define two: BIT(X86_SUBARCH_PC) or BIT(X86_SUBARCH_XEN). >>> >>> But hardware_subarch by default is set to zero. Which means if GRUB2, PXELinux, Xen multiboot1 >>> don't set it - then the X86_SUBARCH_PC is choosen right? >>> >>> 1 << 0 & 1 << X86_SUBARCH_PC (which is zero). >>> >>> For this to nicely work with Xen it ought to do this: >>> >>> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c >>> index 993b7a7..6cf9afd 100644 >>> --- a/arch/x86/xen/enlighten.c >>> +++ b/arch/x86/xen/enlighten.c >>> @@ -1676,6 +1676,7 @@ asmlinkage __visible void __init xen_start_kernel(void) >>> boot_params.hdr.ramdisk_image = initrd_start; >>> boot_params.hdr.ramdisk_size = xen_start_info->mod_len; >>> boot_params.hdr.cmd_line_ptr = __pa(xen_start_info->cmd_line); >>> + boot_params.hdr.hardware_subarch = X86_SUBARCH_XEN; >>> >>> if (!xen_initial_domain()) { >>> add_preferred_console("xenboot", 0, NULL); >>> >>> >>> ? >> That's correct for PV and PVH, likewise when qemu is required for HVM >> qemu could set it. I have the qemu change done but that should only >> cover HVM. 
A common place to set this as well could be the hypervisor, >> but currently the hypervisor doesn't set any boot_params, instead a >> generic struct is passed and the kernel code (for any OS) is expected >> to interpret this and then set the required values for the OS in the >> init path. Long term though if we wanted to merge init further one way >> could be to have the hypervisor just set the zero page cleanly for the >> different modes. If we needed more data other than the >> hardware_subarch we also have the hardware_subarch_data, that's a u64 >> , and how that is used would be up to the subarch. In Xen's case it >> could do what it wants with it. That would still mean perhaps defining >> as part of a Xen boot protocol a place where xen specific code can >> count on finding more Xen data passed by the hypervisor, the >> xen_start_info. That is, if we wanted to merge init paths this is >> something to consider. >> >> One thing I considered on the question of who should set the zero page >> for Xen with the prospect of merging inits, or at least this subarch >> for both short term and long term are the obvious implications in >> terms of hypervisor / kernel / qemu combination requirements if the >> subarch is needed. Having it set in the kernel is an obvious immediate >> choice for PV / PVH but it means we can't merge init paths completely >> (down to asm inits), we'd still be able to merge some C init paths >> though, the first entry would still be different. Having the zero page >> set on the hypervisor would go long ways but it would mean a >> hypervisor change required. > I don't think the hypervisor should be setting Linux specific boot > related parameters, the boot ABI should be OS agnostic. IMHO, a small > shim should be added to Linux in order to set what Linux requires when > entering from a Xen entry point. And that's exactly what HVMlite does. Most of this shim layer is setting up boot_params, after which we jump to standard x86 boot path (i.e. 
startup_{32|64}). With hardware_subarch set to zero. -boris
On 01/21/16 05:45, Boris Ostrovsky wrote: >> I don't think the hypervisor should be setting Linux specific boot >> related parameters, the boot ABI should be OS agnostic. IMHO, a small >> shim should be added to Linux in order to set what Linux requires when >> entering from a Xen entry point. For the record, this is exactly what hardware_subarch is meant to do: revector to a stub to do this kind of setup, for a subarchitecture which doesn't have a natural stub like BIOS or EFI. In the case of Xen PV, or lguest, there are special care that has to be done very early in the path due to the nonstandard handling of page tables, which is another reason for this field. > And that's exactly what HVMlite does. Most of this shim layer is setting > up boot_params, after which we jump to standard x86 boot path (i.e. > startup_{32|64}). With hardware_subarch set to zero. Which is the way to do it as long as the early code can be the same. -hpa
On Thu, Jan 21, 2016 at 11:25 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> And that's exactly what HVMlite does. Most of this shim layer is setting
>> up boot_params, after which we jump to standard x86 boot path (i.e.
>> startup_{32|64}). With hardware_subarch set to zero.
>
> Which is the way to do it as long as the early code can be the same.

To be clear, with the subarch and linker table suggested in my patch series, it should be possible to have the same exact entry point; the Xen PV setup code could run early in the order. For instance, in the linker table we could use the reserved order levels 01-09 for PV hypervisor code:

+/* Init order levels, we can start at 01 but reserve 01-09 for now */
+#define X86_INIT_ORDER_EARLY	10
+#define X86_INIT_ORDER_NORMAL	30
+#define X86_INIT_ORDER_LATE	50

So perhaps X86_INIT_ORDER_PV as 05 later.

The standard x86 init would then just be:

asmlinkage __visible void __init x86_64_start_kernel(char *real_mode_data)
{
	x86_init_fn_init_tables();
	x86_init_fn_early_init();
}

The PV init code would kick in early and could parse the boot_params.hdr.hardware_subarch_data pointer as it sees fit.

  Luis
On 01/21/16 11:46, Luis R. Rodriguez wrote: > On Thu, Jan 21, 2016 at 11:25 AM, H. Peter Anvin <hpa@zytor.com> wrote: >>> And that's exactly what HVMlite does. Most of this shim layer is setting >>> up boot_params, after which we jump to standard x86 boot path (i.e. >>> startup_{32|64}). With hardware_subarch set to zero. >> >> Which is the way to do it as long as the early code can be the same. > > To be clear, with the subarchand linker table suggested in my patch > series, it should be possible to have the same exact entry point, the > Xen PV setup code could run early in the order. For instance in the > linker table we could use the reserved order levels 01-09 for PV > hypervisor code: > > +/* Init order levels, we can start at 01 but reserve 01-09 for now */ > +#define X86_INIT_ORDER_EARLY 10 > +#define X86_INIT_ORDER_NORMAL 30 > +#define X86_INIT_ORDER_LATE 50 > > So perhaps X86_INIT_ORDER_PV as 05 later. > > The standard x86 init would just then be: > > asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data) > { > x86_init_fn_init_tables(); > x86_init_fn_early_init(); > } > > The PV init code would kick in early and could parse the > boot_params.hdr.hardware_subarch_data pointer as it sees fit. > Right... we already do that. I don't even think you need to initialize any tables. At least on i386, we have to do this in assembly code. However, it is just a simple table walk. :) -hpa
On 01/21/16 11:50, H. Peter Anvin wrote: > > Right... we already do that. > > I don't even think you need to initialize any tables. At least on i386, > we have to do this in assembly code. However, it is just a simple table > walk. :) > It might make more sense to make subarch its own table, though, although I haven't looked at your code in enough detail to say. -hpa
On Thu, Jan 21, 2016 at 11:50 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>> The PV init code would kick in early and could parse the
>> boot_params.hdr.hardware_subarch_data pointer as it sees fit.
>
> Right... we already do that.
>
> I don't even think you need to initialize any tables.

The init above is for the sorting. It's not needed unless your subsystem uses it; it's the same sort as with the IOMMU stuff, only with stronger semantics. In theory one would *not* need to do this so early; it could wait, provided you are at least confident the linker order suffices for all routines called early.

To be clear, the struct proposed defines a set of callbacks for calling a feature's callbacks at different levels of the x86 init path. As far as this patch series is concerned, it has one for x86_64_start_reservations(). This could easily be extended to also include one to be called, for instance, at setup_arch(). If we were able to solve the ability to make use of the subarch this early, we could then have a callback for x86_64_start_kernel(), then one at x86_64_start_reservations(), and perhaps one at setup_arch() later -- Kasan is an example that could fit this in the future. What I mean is that we could avoid the sort, or defer the runtime sort until x86_64_start_reservations() or even setup_arch(), provided the default linker order suffices for the series of previous callbacks. It's up to us, but I'm taking care to be pedantic about semantics here.

> At least on i386,
> we have to do this in assembly code.

Neat! How is that order kept?

> However, it is just a simple table walk. :)

  Luis
On 01/21/16 12:05, Luis R. Rodriguez wrote: > >> At least on i386, >> we have to do this in assembly code. > > Neat! How is that order kept? > Right now subarch_entries[] is just an array indexed by subarch number hardcoded in head_32.S. However, if you have a list of (id, target) then you could just iterate over it. -hpa
On Thu, Jan 21, 2016 at 11:52 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/21/16 11:50, H. Peter Anvin wrote:
>>
>> Right... we already do that.
>>
>> I don't even think you need to initialize any tables.  At least on i386,
>> we have to do this in assembly code.  However, it is just a simple table
>> walk. :)
>>
>
> It might make more sense to make subarch its own table, though, although
> I haven't looked at your code in enough detail to say.

The gist of it:

void x86_init_fn_early_init(void)
{
	for_each_table_entry(init_fn, X86_INIT_FNS) {
		if (x86_init_fn_supports_subarch(init_fn))
			init_fn->early_init();
	}
}

void x86_init_fn_setup_arch(void)
{
	for_each_table_entry(init_fn, X86_INIT_FNS) {
		if (x86_init_fn_supports_subarch(init_fn))
			init_fn->setup_arch();
	}
}

Right now the code defines just an init routine at the stage for x86_64_start_reservations(); we call that callback early_init(), i.e. setup_arch() doesn't exist yet. Since certain routines can run on either Xen or bare metal, we allow a bitwise OR of subarchs as a bitmask.

I'll think a bit more about how to use a table per subarch, but I can't fit it yet. I'd like to later revise kfree'ing unused sections, and a table per subarch might be nice for that; but otherwise this flow seems easy to follow, so if we can kfree sections within a table we could still accomplish that later.

  Luis
On Wed, Jan 20, 2016 at 2:20 PM, H. Peter Anvin <hpa@zytor.com> wrote: > On January 20, 2016 2:12:49 PM PST, "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: >>On Wed, Jan 20, 2016 at 1:41 PM, H. Peter Anvin <hpa@zytor.com> wrote: >>> On 01/20/16 13:33, Luis R. Rodriguez wrote: >>>> >>>> That's correct for PV and PVH, likewise when qemu is required for >>HVM >>>> qemu could set it. I have the qemu change done but that should only >>>> cover HVM. A common place to set this as well could be the >>hypervisor, >>>> but currently the hypervisor doesn't set any boot_params, instead a >>>> generic struct is passed and the kernel code (for any OS) is >>expected >>>> to interpret this and then set the required values for the OS in the >>>> init path. Long term though if we wanted to merge init further one >>way >>>> could be to have the hypervisor just set the zero page cleanly for >>the >>>> different modes. If we needed more data other than the >>>> hardware_subarch we also have the hardware_subarch_data, that's a >>u64 >>>> , and how that is used would be up to the subarch. In Xen's case it >>>> could do what it wants with it. That would still mean perhaps >>defining >>>> as part of a Xen boot protocol a place where xen specific code can >>>> count on finding more Xen data passed by the hypervisor, the >>>> xen_start_info. That is, if we wanted to merge init paths this is >>>> something to consider. >>>> >>>> One thing I considered on the question of who should set the zero >>page >>>> for Xen with the prospect of merging inits, or at least this subarch >>>> for both short term and long term are the obvious implications in >>>> terms of hypervisor / kernel / qemu combination requirements if the >>>> subarch is needed. 
Having it set in the kernel is an obvious >>immediate >>>> choice for PV / PVH but it means we can't merge init paths >>completely >>>> (down to asm inits), we'd still be able to merge some C init paths >>>> though, the first entry would still be different. Having the zero >>page >>>> set on the hypervisor would go long ways but it would mean a >>>> hypervisor change required. >>>> >>>> These prospects are worth discussing, specially in light of Boris's >>>> hvmlite work. >>>> >>> >>> The above doesn't make sense to me. hardware_subarch is really used >>> when the boot sequence is somehow nonstandard. >> >>Thanks for the feedback -- as it stands today hardware_subarch is only >>used by lguest, Moorestown, and CE4100 even though we had definitions >>for it for Xen -- this is not used yet. Its documentation does make >>references to differences for a paravirtualized environment, and uses >>a few examples but doesn't go into great depths about restrictions so >>its limitations in how we could use it were not clear to me. >> >>> HVM probably doesn't need that. >> >>Today HVM doesn't need it, but perhaps that is because it has not >>needed changes early on boot. Will it, or could it? I'd even invite us >>to consider the same for other hypervisors or PV hypervisors. I'll >>note that things like cpu_has_hypervisor() or derivatives >>(kvm_para_available() which is now used on drivers even, see >>sound/pci/intel8x0.c) requires init_hypervisor_platform() run, in >>terms of the x86 init sequence this is run pretty late at >>setup_arch(). Should code need to know hypervisor info anytime before >>that they have no generic option available. >> >>I'm fine if we want to restrict hardware_subarch but I'll note the >>semantics were not that explicit to delineate clear differences and I >>just wanted to highlight the current early boot restriction of >>cpu_has_hypervisor(). 
>>
>>  Luis
>
> Basically, if the hardware is enumerable using standard PC mechanisms (PCI, ACPI) and doesn't need a special boot flow it should use type 0.

I can extend the documentation as part of this to be clear.

I will note though that this still leaves a gap for code which might want to ask the question "are we in a virt environment, and if so on which one?" between really early boot and right before init_hypervisor_platform(). Or rather, given the subarch can be used by Xen and lguest but not KVM, it means KVM doesn't get to use it. It may not need it, but it's also rather trivial to set up in qemu, and I have a patch for that if we wanted one for KVM. That would extend the definition of subarch a bit more, but as it stands today its use is rather limited. Heck, hardware_subarch_data is to this day unused.

  Luis
On 01/21/16 16:25, Luis R. Rodriguez wrote: >> >> Basically, if the hardware is enumerable using standard PC mechanisms (PCI, ACPI) and doesn't need a special boot flow it should use type 0. > > I can extend the documentation as part of this to be clear. > > I will note though that this still leaves a gap on code which might > want to access the question "are we in a virt environment, and if so > on which one" in between really early boot and right before > init_hypervisor_platform(). Or rather, given subarch can be used by > Xen and lguest but not KVM it means KVM doesn't get to use it. It may > not need it, but its also rather trivial to set up on qemu, and I have > a patch for that if we wanted one for KVM. That would extend the > definition of subarch a bit more, but as it stands today its use was > rather limited. Heck, subharch_data is to this day unused. > KVM is not a subarch, and Xen HVM isn't either; the subarch was meant to be specifically to handle nonstandard boot entries; the CE4100 extension was itself kind of a hack. If you have a genuine need for a "hypervisor type" then that is a separate thing and should be treated separately from subarch. However, you need to consider that some hypervisors can emulate other hypervisors and you may have more than one hypervisor API available. -hpa
On Thu, Jan 21, 2016 at 4:42 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/21/16 16:25, Luis R. Rodriguez wrote:
>>>
>>> Basically, if the hardware is enumerable using standard PC mechanisms (PCI, ACPI) and doesn't need a special boot flow it should use type 0.
>>
>> I can extend the documentation as part of this to be clear.
>>
>> I will note though that this still leaves a gap on code which might
>> want to access the question "are we in a virt environment, and if so
>> on which one" in between really early boot and right before
>> init_hypervisor_platform(). Or rather, given subarch can be used by
>> Xen and lguest but not KVM it means KVM doesn't get to use it. It may
>> not need it, but its also rather trivial to set up on qemu, and I have
>> a patch for that if we wanted one for KVM. That would extend the
>> definition of subarch a bit more, but as it stands today its use was
>> rather limited. Heck, subharch_data is to this day unused.
>>
>
> KVM is not a subarch, and Xen HVM isn't either; the subarch was meant to
> be specifically to handle nonstandard boot entries; the CE4100 extension
> was itself kind of a hack.
>
> If you have a genuine need for a "hypervisor type" then that is a
> separate thing and should be treated separately from subarch. However,
> you need to consider that some hypervisors can emulate other hypervisors
> and you may have more than one hypervisor API available.

Thanks, I've pegged this onto my series as a documentation extension for boot params.

  Luis
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 993b7a7..6cf9afd 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1676,6 +1676,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
 	boot_params.hdr.ramdisk_image = initrd_start;
 	boot_params.hdr.ramdisk_size = xen_start_info->mod_len;
 	boot_params.hdr.cmd_line_ptr = __pa(xen_start_info->cmd_line);
+	boot_params.hdr.hardware_subarch = X86_SUBARCH_XEN;
 
 	if (!xen_initial_domain()) {
 		add_preferred_console("xenboot", 0, NULL);