Message ID | cc0fbcb4-5ea3-178c-e691-9acb7cc9a3a7@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [4.17] EFI: don't convert memory marked for runtime use to ordinary RAM | expand |
Hi Jan, We will review and test the arm part (even though it is modifying some unused code at the moment) but I wanted to answer you on some questions you have .. > On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: > > efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > higher priority than the type of the range. To avoid accessing memory at > runtime which was re-used for other purposes, make > efi_arch_process_memory_map() follow suit. While on x86 in theory the > same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > E820_ACPI memory there and hence that type's handling can be left alone. > > Fixes: bf6501a62e80 ("x86-64: EFI boot code") > Fixes: facac0af87ef ("x86-64: EFI runtime code") > Fixes: 6d70ea10d49f ("Add ARM EFI boot support") > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > Partly RFC for Arm, for two reasons: > > On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: > For one like on x86 such ranges would likely better be retained, as Dom0 > may (will?) have a need to look at tables placed there. Plus converting > such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to > me as well. I'd be inclined to make the latter adjustment right here > (while the other change probably would better be separate, if there > aren't actually reasons for the present behavior). > > On Arm efi_init_memory() is compiled out, so adjusting Arm code here is > perhaps more for consistency (not leaving a trap for someone to later > fall into) than a strict requirement. I wonder though how Arm has > managed to get away without at least some parts of efi_init_memory() for > all the years that ACPI support has been present there. I guess this is > connected to most of runtime.c also being compiled out, but that > continuing to be the case is another aspect puzzling me. On arm we only use the boot services in Xen and we do not provide any efi services to dom0. The required info is passed through a simple device tree. There was a discussion on that subject some weeks ago and it is still an open point to be solved. Also APCI is officially unsupported on arm. Cheers Bertrand > > --- a/xen/arch/arm/efi/efi-boot.h > +++ b/xen/arch/arm/efi/efi-boot.h > @@ -183,13 +183,15 @@ static EFI_STATUS __init efi_process_mem > > for ( Index = 0; Index < (mmap_size / desc_size); Index++ ) > { > - if ( desc_ptr->Attribute & EFI_MEMORY_WB && > - (desc_ptr->Type == EfiConventionalMemory || > - desc_ptr->Type == EfiLoaderCode || > - desc_ptr->Type == EfiLoaderData || > - (!map_bs && > - (desc_ptr->Type == EfiBootServicesCode || > - desc_ptr->Type == EfiBootServicesData))) ) > + if ( desc_ptr->Attribute & EFI_MEMORY_RUNTIME ) > + /* nothing */; > + else if ( (desc_ptr->Attribute & EFI_MEMORY_WB) && > + (desc_ptr->Type == EfiConventionalMemory || > + desc_ptr->Type == EfiLoaderCode || > + desc_ptr->Type == EfiLoaderData || > + (!map_bs && > + (desc_ptr->Type == EfiBootServicesCode || > + desc_ptr->Type == EfiBootServicesData))) ) > { > if ( !meminfo_add_bank(&bootinfo.mem, desc_ptr) ) > { > --- a/xen/arch/x86/efi/efi-boot.h > +++ b/xen/arch/x86/efi/efi-boot.h > @@ -185,7 +185,9 @@ static void __init efi_arch_process_memo > /* fall through */ > case EfiLoaderCode: > case EfiLoaderData: > - if ( desc->Attribute & EFI_MEMORY_WB ) > + if ( desc->Attribute & EFI_MEMORY_RUNTIME ) > + type = E820_RESERVED; > + else if ( desc->Attribute & EFI_MEMORY_WB ) > type = E820_RAM; > else > case EfiUnusableMemory:
> On 30 Sep 2022, at 08:50, Jan Beulich <jbeulich@suse.com> wrote: > > efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > higher priority than the type of the range. To avoid accessing memory at > runtime which was re-used for other purposes, make > efi_arch_process_memory_map() follow suit. While on x86 in theory the > same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > E820_ACPI memory there and hence that type's handling can be left alone. > > Fixes: bf6501a62e80 ("x86-64: EFI boot code") > Fixes: facac0af87ef ("x86-64: EFI runtime code") > Fixes: 6d70ea10d49f ("Add ARM EFI boot support") > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- Hi Jan, For the Arm part: Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> I’ve also tested the EFI+ACPI boot on two arm boards Tested-By: Luca Fancellu <luca.fancellu@arm.com>
Hi Jan, > On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: > > efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > higher priority than the type of the range. To avoid accessing memory at > runtime which was re-used for other purposes, make > efi_arch_process_memory_map() follow suit. While on x86 in theory the > same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > E820_ACPI memory there and hence that type's handling can be left alone. > > Fixes: bf6501a62e80 ("x86-64: EFI boot code") > Fixes: facac0af87ef ("x86-64: EFI runtime code") > Fixes: 6d70ea10d49f ("Add ARM EFI boot support") > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm Cheers Bertrand > --- > Partly RFC for Arm, for two reasons: > > On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: > For one like on x86 such ranges would likely better be retained, as Dom0 > may (will?) have a need to look at tables placed there. Plus converting > such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to > me as well. I'd be inclined to make the latter adjustment right here > (while the other change probably would better be separate, if there > aren't actually reasons for the present behavior). > > On Arm efi_init_memory() is compiled out, so adjusting Arm code here is > perhaps more for consistency (not leaving a trap for someone to later > fall into) than a strict requirement. I wonder though how Arm has > managed to get away without at least some parts of efi_init_memory() for > all the years that ACPI support has been present there. I guess this is > connected to most of runtime.c also being compiled out, but that > continuing to be the case is another aspect puzzling me. > > --- a/xen/arch/arm/efi/efi-boot.h > +++ b/xen/arch/arm/efi/efi-boot.h > @@ -183,13 +183,15 @@ static EFI_STATUS __init efi_process_mem > > for ( Index = 0; Index < (mmap_size / desc_size); Index++ ) > { > - if ( desc_ptr->Attribute & EFI_MEMORY_WB && > - (desc_ptr->Type == EfiConventionalMemory || > - desc_ptr->Type == EfiLoaderCode || > - desc_ptr->Type == EfiLoaderData || > - (!map_bs && > - (desc_ptr->Type == EfiBootServicesCode || > - desc_ptr->Type == EfiBootServicesData))) ) > + if ( desc_ptr->Attribute & EFI_MEMORY_RUNTIME ) > + /* nothing */; > + else if ( (desc_ptr->Attribute & EFI_MEMORY_WB) && > + (desc_ptr->Type == EfiConventionalMemory || > + desc_ptr->Type == EfiLoaderCode || > + desc_ptr->Type == EfiLoaderData || > + (!map_bs && > + (desc_ptr->Type == EfiBootServicesCode || > + desc_ptr->Type == EfiBootServicesData))) ) > { > if ( !meminfo_add_bank(&bootinfo.mem, desc_ptr) ) > { > --- a/xen/arch/x86/efi/efi-boot.h > +++ b/xen/arch/x86/efi/efi-boot.h > @@ -185,7 +185,9 @@ static void __init efi_arch_process_memo > /* fall through */ > case EfiLoaderCode: > case EfiLoaderData: > - if ( desc->Attribute & EFI_MEMORY_WB ) > + if ( desc->Attribute & EFI_MEMORY_RUNTIME ) > + type = E820_RESERVED; > + else if ( desc->Attribute & EFI_MEMORY_WB ) > type = E820_RAM; > else > case EfiUnusableMemory:
On 30/09/2022 08:50, Jan Beulich wrote: > efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > higher priority than the type of the range. To avoid accessing memory at > runtime which was re-used for other purposes, make > efi_arch_process_memory_map() follow suit. While on x86 in theory the > same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > E820_ACPI memory there and hence that type's handling can be left alone. > > Fixes: bf6501a62e80 ("x86-64: EFI boot code") > Fixes: facac0af87ef ("x86-64: EFI runtime code") > Fixes: 6d70ea10d49f ("Add ARM EFI boot support") > Signed-off-by: Jan Beulich <jbeulich@suse.com> Isn't this also liable to be the root cause of the crash reported on IRC, where a read-only page was being inserted into the heap? ~Andrew
On 30.09.2022 14:53, Andrew Cooper wrote: > On 30/09/2022 08:50, Jan Beulich wrote: >> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >> higher priority than the type of the range. To avoid accessing memory at >> runtime which was re-used for other purposes, make >> efi_arch_process_memory_map() follow suit. While on x86 in theory the >> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >> E820_ACPI memory there and hence that type's handling can be left alone. >> >> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >> Fixes: facac0af87ef ("x86-64: EFI runtime code") >> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Isn't this also liable to be the root cause of the crash reported on > IRC, where a read-only page was being inserted into the heap? I wouldn't be able to make the connection, I'm afraid. What was asked there was lacking details, though. Jan
Hi Andrew, > On 30 Sep 2022, at 14:53, Andrew Cooper <Andrew.Cooper3@citrix.com> wrote: > > On 30/09/2022 08:50, Jan Beulich wrote: >> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >> higher priority than the type of the range. To avoid accessing memory at >> runtime which was re-used for other purposes, make >> efi_arch_process_memory_map() follow suit. While on x86 in theory the >> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >> E820_ACPI memory there and hence that type's handling can be left alone. >> >> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >> Fixes: facac0af87ef ("x86-64: EFI runtime code") >> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Isn't this also liable to be the root cause of the crash reported on > IRC, where a read-only page was being inserted into the heap? No this should not be related at all. Cheers Bertrand > > ~Andrew
On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: > efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > higher priority than the type of the range. To avoid accessing memory at > runtime which was re-used for other purposes, make > efi_arch_process_memory_map() follow suit. While on x86 in theory the > same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > E820_ACPI memory there and hence that type's handling can be left alone. What about dom0? Should it be translated to E820_RESERVED so that dom0 doesn't try to use it either? I guess using E820_RESERVED could also confuse dom0 about ACPI data placement. Thanks, Roger.
On 30.09.2022 16:28, Roger Pau Monné wrote: > On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >> higher priority than the type of the range. To avoid accessing memory at >> runtime which was re-used for other purposes, make >> efi_arch_process_memory_map() follow suit. While on x86 in theory the >> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >> E820_ACPI memory there and hence that type's handling can be left alone. > > What about dom0? Should it be translated to E820_RESERVED so that > dom0 doesn't try to use it either? I'm afraid I don't understand the questions. Not the least because I think "it" can't really mean "dom0" from the earlier sentence. Jan > I guess using E820_RESERVED > could also confuse dom0 about ACPI data placement. > > Thanks, Roger.
On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: > On 30.09.2022 16:28, Roger Pau Monné wrote: > > On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: > >> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > >> higher priority than the type of the range. To avoid accessing memory at > >> runtime which was re-used for other purposes, make > >> efi_arch_process_memory_map() follow suit. While on x86 in theory the > >> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > >> E820_ACPI memory there and hence that type's handling can be left alone. > > > > What about dom0? Should it be translated to E820_RESERVED so that > > dom0 doesn't try to use it either? > > I'm afraid I don't understand the questions. Not the least because I > think "it" can't really mean "dom0" from the earlier sentence. Sorry, let me try again: The memory map provided to dom0 will contain E820_ACPI entries for memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, overwriting the data needed for runtime services? Thanks, Roger.
On 04.10.2022 11:33, Roger Pau Monné wrote: > On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >> On 30.09.2022 16:28, Roger Pau Monné wrote: >>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>> higher priority than the type of the range. To avoid accessing memory at >>>> runtime which was re-used for other purposes, make >>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>> E820_ACPI memory there and hence that type's handling can be left alone. >>> >>> What about dom0? Should it be translated to E820_RESERVED so that >>> dom0 doesn't try to use it either? >> >> I'm afraid I don't understand the questions. Not the least because I >> think "it" can't really mean "dom0" from the earlier sentence. > > Sorry, let me try again: > > The memory map provided to dom0 will contain E820_ACPI entries for > memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory > map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, > overwriting the data needed for runtime services? How would Dom0 go about doing so? It has no control over what we hand to the page allocator - it can only free pages which were actually allocated to it. E820_ACPI and E820_RESERVED pages are assigned to DomIO - Dom0 can map and access them, but it cannot free them. Jan
On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: > On 04.10.2022 11:33, Roger Pau Monné wrote: > > On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: > >> On 30.09.2022 16:28, Roger Pau Monné wrote: > >>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: > >>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > >>>> higher priority than the type of the range. To avoid accessing memory at > >>>> runtime which was re-used for other purposes, make > >>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the > >>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > >>>> E820_ACPI memory there and hence that type's handling can be left alone. > >>> > >>> What about dom0? Should it be translated to E820_RESERVED so that > >>> dom0 doesn't try to use it either? > >> > >> I'm afraid I don't understand the questions. Not the least because I > >> think "it" can't really mean "dom0" from the earlier sentence. > > > > Sorry, let me try again: > > > > The memory map provided to dom0 will contain E820_ACPI entries for > > memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory > > map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, > > overwriting the data needed for runtime services? > > How would Dom0 go about doing so? It has no control over what we hand > to the page allocator - it can only free pages which were actually > allocated to it. E820_ACPI and E820_RESERVED pages are assigned to > DomIO - Dom0 can map and access them, but it cannot free them. Maybe I'm very confused, but what about dom0 overwriting the data there, won't it cause issues to runtime services? If the memory is reported in the memory map provided to dom0 as E820_ACPI dom0 is free to reclaim the region for it's own usage. Thanks, Roger.
On 04.10.2022 12:38, Roger Pau Monné wrote: > On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: >> On 04.10.2022 11:33, Roger Pau Monné wrote: >>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >>>> On 30.09.2022 16:28, Roger Pau Monné wrote: >>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>> runtime which was re-used for other purposes, make >>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>> >>>>> What about dom0? Should it be translated to E820_RESERVED so that >>>>> dom0 doesn't try to use it either? >>>> >>>> I'm afraid I don't understand the questions. Not the least because I >>>> think "it" can't really mean "dom0" from the earlier sentence. >>> >>> Sorry, let me try again: >>> >>> The memory map provided to dom0 will contain E820_ACPI entries for >>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory >>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, >>> overwriting the data needed for runtime services? >> >> How would Dom0 go about doing so? It has no control over what we hand >> to the page allocator - it can only free pages which were actually >> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to >> DomIO - Dom0 can map and access them, but it cannot free them. > > Maybe I'm very confused, but what about dom0 overwriting the data > there, won't it cause issues to runtime services? If it overwrites it, of course there are going to be issues. Just like there are going to be problems from anything else Dom0 does wrong. > If the memory is reported in the memory map provided to dom0 as > E820_ACPI dom0 is free to reclaim the region for it's own usage. Could you outline to me how such a "reclaim" process would look like? For the range to become ordinary RAM, Xen needs to be involved. But there's no hypercall allowing Dom0 to free a page which wasn't allocated to it. And the Dom0 kernel simply re-using the range as if it was RAM is flawed - it would break the latest once Dom0 would try to balloon out such a page. Jan
On 04/10/2022 11:23, Jan Beulich wrote: > On 04.10.2022 11:33, Roger Pau Monné wrote: >> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >>> On 30.09.2022 16:28, Roger Pau Monné wrote: >>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>> higher priority than the type of the range. To avoid accessing memory at >>>>> runtime which was re-used for other purposes, make >>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>> What about dom0? Should it be translated to E820_RESERVED so that >>>> dom0 doesn't try to use it either? >>> I'm afraid I don't understand the questions. Not the least because I >>> think "it" can't really mean "dom0" from the earlier sentence. >> Sorry, let me try again: >> >> The memory map provided to dom0 will contain E820_ACPI entries for >> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory >> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, >> overwriting the data needed for runtime services? > How would Dom0 go about doing so? Zeroing the memory and putting it into its own heap. You seem to be presuming that some unwritten and unenforced rules exist. Once dom0 has booted, Xen cannot safety touch any ACPI reclaimable range. It doesn't go wrong in practice because OSes don't actually reclaim the reclaimable ranges (which is also why Xen HVM guests don't explode either.) ~Andrew
On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: > On 04.10.2022 12:38, Roger Pau Monné wrote: > > On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: > >> On 04.10.2022 11:33, Roger Pau Monné wrote: > >>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: > >>>> On 30.09.2022 16:28, Roger Pau Monné wrote: > >>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: > >>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > >>>>>> higher priority than the type of the range. To avoid accessing memory at > >>>>>> runtime which was re-used for other purposes, make > >>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the > >>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > >>>>>> E820_ACPI memory there and hence that type's handling can be left alone. > >>>>> > >>>>> What about dom0? Should it be translated to E820_RESERVED so that > >>>>> dom0 doesn't try to use it either? > >>>> > >>>> I'm afraid I don't understand the questions. Not the least because I > >>>> think "it" can't really mean "dom0" from the earlier sentence. > >>> > >>> Sorry, let me try again: > >>> > >>> The memory map provided to dom0 will contain E820_ACPI entries for > >>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory > >>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, > >>> overwriting the data needed for runtime services? > >> > >> How would Dom0 go about doing so? It has no control over what we hand > >> to the page allocator - it can only free pages which were actually > >> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to > >> DomIO - Dom0 can map and access them, but it cannot free them. > > > > Maybe I'm very confused, but what about dom0 overwriting the data > > there, won't it cause issues to runtime services? > > If it overwrites it, of course there are going to be issues. Just like > there are going to be problems from anything else Dom0 does wrong. But would dom0 know it's doing something wrong? The region is just marked as E820_ACPI from dom0 PoV, so it doesn't know it's required by EFI runtime services, and dom0 could legitimately overwrite the region once it considers all ACPI parsing done from it's side. Thanks, Roger.
On 04.10.2022 12:49, Andrew Cooper wrote: > On 04/10/2022 11:23, Jan Beulich wrote: >> On 04.10.2022 11:33, Roger Pau Monné wrote: >>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >>>> On 30.09.2022 16:28, Roger Pau Monné wrote: >>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>> runtime which was re-used for other purposes, make >>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>> What about dom0? Should it be translated to E820_RESERVED so that >>>>> dom0 doesn't try to use it either? >>>> I'm afraid I don't understand the questions. Not the least because I >>>> think "it" can't really mean "dom0" from the earlier sentence. >>> Sorry, let me try again: >>> >>> The memory map provided to dom0 will contain E820_ACPI entries for >>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory >>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, >>> overwriting the data needed for runtime services? >> How would Dom0 go about doing so? > > Zeroing the memory and putting it into its own heap. This makes no sense. > You seem to be presuming that some unwritten and unenforced rules exist. There's the PV interface. All memory management related functionality has to be based on the E820 map handed to the domain (irrespective of it being Dom0 or DomU). That map has no E820_ACPI entries for PV (and PVH Dom0 is as of yet unsupported). Dom0 is also handed the host E820, but it is supposed to use it only for resource management (e.g. to determine where BARs may be placed, or how to arrange its PFN space). > Once dom0 has booted, Xen cannot safety touch any ACPI reclaimable > range. I clearly disagree, and I expect existing behavior also disagrees with you. Jan > It doesn't go wrong in practice because OSes don't actually > reclaim the reclaimable ranges (which is also why Xen HVM guests don't > explode either.) > > ~Andrew
On 04.10.2022 12:54, Roger Pau Monné wrote: > On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: >> On 04.10.2022 12:38, Roger Pau Monné wrote: >>> On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: >>>> On 04.10.2022 11:33, Roger Pau Monné wrote: >>>>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >>>>>> On 30.09.2022 16:28, Roger Pau Monné wrote: >>>>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>>>> runtime which was re-used for other purposes, make >>>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>>>> >>>>>>> What about dom0? Should it be translated to E820_RESERVED so that >>>>>>> dom0 doesn't try to use it either? >>>>>> >>>>>> I'm afraid I don't understand the questions. Not the least because I >>>>>> think "it" can't really mean "dom0" from the earlier sentence. >>>>> >>>>> Sorry, let me try again: >>>>> >>>>> The memory map provided to dom0 will contain E820_ACPI entries for >>>>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory >>>>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, >>>>> overwriting the data needed for runtime services? >>>> >>>> How would Dom0 go about doing so? It has no control over what we hand >>>> to the page allocator - it can only free pages which were actually >>>> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to >>>> DomIO - Dom0 can map and access them, but it cannot free them. >>> >>> Maybe I'm very confused, but what about dom0 overwriting the data >>> there, won't it cause issues to runtime services? >> >> If it overwrites it, of course there are going to be issues. Just like >> there are going to be problems from anything else Dom0 does wrong. > > But would dom0 know it's doing something wrong? Yes. Please also see my reply to Andrew. > The region is just marked as E820_ACPI from dom0 PoV, so it doesn't > know it's required by EFI runtime services, and dom0 could > legitimately overwrite the region once it considers all ACPI parsing > done from it's side. PV Dom0 won't ever see E820_ACPI in the relevant E820 map; this type can only appear in the machine E820. In how far PVH Dom0 might need to take special care I can't tell right now (but at least for kexec purposes I expect Linux isn't going to recycle E820_ACPI regions even going forward). Jan
On Tue, Oct 04, 2022 at 02:18:31PM +0200, Jan Beulich wrote: > On 04.10.2022 12:54, Roger Pau Monné wrote: > > On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: > >> On 04.10.2022 12:38, Roger Pau Monné wrote: > >>> On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: > >>>> On 04.10.2022 11:33, Roger Pau Monné wrote: > >>>>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: > >>>>>> On 30.09.2022 16:28, Roger Pau Monné wrote: > >>>>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: > >>>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > >>>>>>>> higher priority than the type of the range. To avoid accessing memory at > >>>>>>>> runtime which was re-used for other purposes, make > >>>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the > >>>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > >>>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. > >>>>>>> > >>>>>>> What about dom0? Should it be translated to E820_RESERVED so that > >>>>>>> dom0 doesn't try to use it either? > >>>>>> > >>>>>> I'm afraid I don't understand the questions. Not the least because I > >>>>>> think "it" can't really mean "dom0" from the earlier sentence. > >>>>> > >>>>> Sorry, let me try again: > >>>>> > >>>>> The memory map provided to dom0 will contain E820_ACPI entries for > >>>>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory > >>>>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, > >>>>> overwriting the data needed for runtime services? > >>>> > >>>> How would Dom0 go about doing so? It has no control over what we hand > >>>> to the page allocator - it can only free pages which were actually > >>>> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to > >>>> DomIO - Dom0 can map and access them, but it cannot free them. > >>> > >>> Maybe I'm very confused, but what about dom0 overwriting the data > >>> there, won't it cause issues to runtime services? > >> > >> If it overwrites it, of course there are going to be issues. Just like > >> there are going to be problems from anything else Dom0 does wrong. > > > > But would dom0 know it's doing something wrong? > > Yes. Please also see my reply to Andrew. > > > The region is just marked as E820_ACPI from dom0 PoV, so it doesn't > > know it's required by EFI runtime services, and dom0 could > > legitimately overwrite the region once it considers all ACPI parsing > > done from it's side. > > PV Dom0 won't ever see E820_ACPI in the relevant E820 map; this type can > only appear in the machine E820. In how far PVH Dom0 might need to take > special care I can't tell right now (but at least for kexec purposes I > expect Linux isn't going to recycle E820_ACPI regions even going forward). Even if unlikely, couldn't some dom0 OS look at the machine map after processing ACPI and just decide to overwrite the ACPI regions? Not that it's useful from an OS PoV, but also we have no statement saying that E820_ACPI in the machine memory map shouldn't be overwritten. Thanks, Roger.
On 04.10.2022 14:52, Roger Pau Monné wrote: > On Tue, Oct 04, 2022 at 02:18:31PM +0200, Jan Beulich wrote: >> On 04.10.2022 12:54, Roger Pau Monné wrote: >>> On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: >>>> On 04.10.2022 12:38, Roger Pau Monné wrote: >>>>> On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: >>>>>> On 04.10.2022 11:33, Roger Pau Monné wrote: >>>>>>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >>>>>>>> On 30.09.2022 16:28, Roger Pau Monné wrote: >>>>>>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>>>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>>>>>> runtime which was re-used for other purposes, make >>>>>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>>>>>> >>>>>>>>> What about dom0? Should it be translated to E820_RESERVED so that >>>>>>>>> dom0 doesn't try to use it either? >>>>>>>> >>>>>>>> I'm afraid I don't understand the questions. Not the least because I >>>>>>>> think "it" can't really mean "dom0" from the earlier sentence. >>>>>>> >>>>>>> Sorry, let me try again: >>>>>>> >>>>>>> The memory map provided to dom0 will contain E820_ACPI entries for >>>>>>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory >>>>>>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, >>>>>>> overwriting the data needed for runtime services? >>>>>> >>>>>> How would Dom0 go about doing so? It has no control over what we hand >>>>>> to the page allocator - it can only free pages which were actually >>>>>> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to >>>>>> DomIO - Dom0 can map and access them, but it cannot free them. >>>>> >>>>> Maybe I'm very confused, but what about dom0 overwriting the data >>>>> there, won't it cause issues to runtime services? >>>> >>>> If it overwrites it, of course there are going to be issues. Just like >>>> there are going to be problems from anything else Dom0 does wrong. >>> >>> But would dom0 know it's doing something wrong? >> >> Yes. Please also see my reply to Andrew. >> >>> The region is just marked as E820_ACPI from dom0 PoV, so it doesn't >>> know it's required by EFI runtime services, and dom0 could >>> legitimately overwrite the region once it considers all ACPI parsing >>> done from it's side. >> >> PV Dom0 won't ever see E820_ACPI in the relevant E820 map; this type can >> only appear in the machine E820. In how far PVH Dom0 might need to take >> special care I can't tell right now (but at least for kexec purposes I >> expect Linux isn't going to recycle E820_ACPI regions even going forward). > > Even if unlikely, couldn't some dom0 OS look at the machine map after > processing ACPI and just decide to overwrite the ACPI regions? > > Not that it's useful from an OS PoV, but also we have no statement > saying that E820_ACPI in the machine memory map shouldn't be > overwritten. There are many things we have no statements for, yet we imply certain behavior or restrictions. The machine memory map, imo, clearly isn't intended for this kind of use. Jan
On Tue, Oct 04, 2022 at 03:10:57PM +0200, Jan Beulich wrote: > On 04.10.2022 14:52, Roger Pau Monné wrote: > > On Tue, Oct 04, 2022 at 02:18:31PM +0200, Jan Beulich wrote: > >> On 04.10.2022 12:54, Roger Pau Monné wrote: > >>> On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: > >>>> On 04.10.2022 12:38, Roger Pau Monné wrote: > >>>>> On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: > >>>>>> On 04.10.2022 11:33, Roger Pau Monné wrote: > >>>>>>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: > >>>>>>>> On 30.09.2022 16:28, Roger Pau Monné wrote: > >>>>>>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: > >>>>>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > >>>>>>>>>> higher priority than the type of the range. To avoid accessing memory at > >>>>>>>>>> runtime which was re-used for other purposes, make > >>>>>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the > >>>>>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > >>>>>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. > >>>>>>>>> > >>>>>>>>> What about dom0? Should it be translated to E820_RESERVED so that > >>>>>>>>> dom0 doesn't try to use it either? > >>>>>>>> > >>>>>>>> I'm afraid I don't understand the questions. Not the least because I > >>>>>>>> think "it" can't really mean "dom0" from the earlier sentence. > >>>>>>> > >>>>>>> Sorry, let me try again: > >>>>>>> > >>>>>>> The memory map provided to dom0 will contain E820_ACPI entries for > >>>>>>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory > >>>>>>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, > >>>>>>> overwriting the data needed for runtime services? > >>>>>> > >>>>>> How would Dom0 go about doing so? It has no control over what we hand > >>>>>> to the page allocator - it can only free pages which were actually > >>>>>> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to > >>>>>> DomIO - Dom0 can map and access them, but it cannot free them. > >>>>> > >>>>> Maybe I'm very confused, but what about dom0 overwriting the data > >>>>> there, won't it cause issues to runtime services? > >>>> > >>>> If it overwrites it, of course there are going to be issues. Just like > >>>> there are going to be problems from anything else Dom0 does wrong. > >>> > >>> But would dom0 know it's doing something wrong? > >> > >> Yes. Please also see my reply to Andrew. > >> > >>> The region is just marked as E820_ACPI from dom0 PoV, so it doesn't > >>> know it's required by EFI runtime services, and dom0 could > >>> legitimately overwrite the region once it considers all ACPI parsing > >>> done from it's side. > >> > >> PV Dom0 won't ever see E820_ACPI in the relevant E820 map; this type can > >> only appear in the machine E820. In how far PVH Dom0 might need to take > >> special care I can't tell right now (but at least for kexec purposes I > >> expect Linux isn't going to recycle E820_ACPI regions even going forward). > > > > Even if unlikely, couldn't some dom0 OS look at the machine map after > > processing ACPI and just decide to overwrite the ACPI regions? > > > > Not that it's useful from an OS PoV, but also we have no statement > > saying that E820_ACPI in the machine memory map shouldn't be > > overwritten. > > There are many things we have no statements for, yet we imply certain > behavior or restrictions. The machine memory map, imo, clearly isn't > intended for this kind of use. There isn't much I can say then. I do feel we are creating rules out of thin air. I do think the commit message should mention that we rely on dom0 not overwriting the data in the E820_ACPI regions on the machine memory map. Thanks, Roger.
On 04.10.2022 16:01, Roger Pau Monné wrote: > On Tue, Oct 04, 2022 at 03:10:57PM +0200, Jan Beulich wrote: >> On 04.10.2022 14:52, Roger Pau Monné wrote: >>> On Tue, Oct 04, 2022 at 02:18:31PM +0200, Jan Beulich wrote: >>>> On 04.10.2022 12:54, Roger Pau Monné wrote: >>>>> On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: >>>>>> On 04.10.2022 12:38, Roger Pau Monné wrote: >>>>>>> On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: >>>>>>>> On 04.10.2022 11:33, Roger Pau Monné wrote: >>>>>>>>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >>>>>>>>>> On 30.09.2022 16:28, Roger Pau Monné wrote: >>>>>>>>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>>>>>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>>>>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>>>>>>>> runtime which was re-used for other purposes, make >>>>>>>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>>>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>>>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>>>>>>>> >>>>>>>>>>> What about dom0? Should it be translated to E820_RESERVED so that >>>>>>>>>>> dom0 doesn't try to use it either? >>>>>>>>>> >>>>>>>>>> I'm afraid I don't understand the questions. Not the least because I >>>>>>>>>> think "it" can't really mean "dom0" from the earlier sentence. >>>>>>>>> >>>>>>>>> Sorry, let me try again: >>>>>>>>> >>>>>>>>> The memory map provided to dom0 will contain E820_ACPI entries for >>>>>>>>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory >>>>>>>>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, >>>>>>>>> overwriting the data needed for runtime services? >>>>>>>> >>>>>>>> How would Dom0 go about doing so? It has no control over what we hand >>>>>>>> to the page allocator - it can only free pages which were actually >>>>>>>> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to >>>>>>>> DomIO - Dom0 can map and access them, but it cannot free them. >>>>>>> >>>>>>> Maybe I'm very confused, but what about dom0 overwriting the data >>>>>>> there, won't it cause issues to runtime services? >>>>>> >>>>>> If it overwrites it, of course there are going to be issues. Just like >>>>>> there are going to be problems from anything else Dom0 does wrong. >>>>> >>>>> But would dom0 know it's doing something wrong? >>>> >>>> Yes. Please also see my reply to Andrew. >>>> >>>>> The region is just marked as E820_ACPI from dom0 PoV, so it doesn't >>>>> know it's required by EFI runtime services, and dom0 could >>>>> legitimately overwrite the region once it considers all ACPI parsing >>>>> done from it's side. >>>> >>>> PV Dom0 won't ever see E820_ACPI in the relevant E820 map; this type can >>>> only appear in the machine E820. In how far PVH Dom0 might need to take >>>> special care I can't tell right now (but at least for kexec purposes I >>>> expect Linux isn't going to recycle E820_ACPI regions even going forward). >>> >>> Even if unlikely, couldn't some dom0 OS look at the machine map after >>> processing ACPI and just decide to overwrite the ACPI regions? >>> >>> Not that it's useful from an OS PoV, but also we have no statement >>> saying that E820_ACPI in the machine memory map shouldn't be >>> overwritten. >> >> There are many things we have no statements for, yet we imply certain >> behavior or restrictions. The machine memory map, imo, clearly isn't >> intended for this kind of use. > > There isn't much I can say then. I do feel we are creating rules out > of thin air. > > I do think the commit message should mention that we rely on dom0 not > overwriting the data in the E820_ACPI regions on the machine memory > map. Hmm, am I getting it right that you think I need to add further justification for a change I'm _not_ making? And which, if we wanted to change our behavior, would require a similar change (or perhaps a change elsewhere) in E820 (i.e. non-EFI) handling? The modification I'm making is solely towards Xen's internal memory management. I'm really having a hard time seeing how commenting on expected Dom0 behavior would fit here (leaving aside that I'm still puzzled by both you and Andrew thinking that there's any whatsoever remote indication anywhere that Dom0 recycling E820_ACPI could be an okay thing in a PV Dom0 kernel). The more that marking EfiACPIReclaimMemory anything other than E820_ACPI might, as iirc you did say yourself, also confuse e.g. the ACPI subsystem of Dom0's kernel. But well, would extending that sentence to "While on x86 in theory the same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" E820_ACPI memory there (and it would be a bug if the Dom0 kernel tried to do so, bypassing Xen's memory management), hence that type's handling can be left alone" satisfy your request? Jan
On Tue, Oct 04, 2022 at 04:39:26PM +0200, Jan Beulich wrote: > On 04.10.2022 16:01, Roger Pau Monné wrote: > > On Tue, Oct 04, 2022 at 03:10:57PM +0200, Jan Beulich wrote: > >> On 04.10.2022 14:52, Roger Pau Monné wrote: > >>> On Tue, Oct 04, 2022 at 02:18:31PM +0200, Jan Beulich wrote: > >>>> On 04.10.2022 12:54, Roger Pau Monné wrote: > >>>>> On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: > >>>>>> On 04.10.2022 12:38, Roger Pau Monné wrote: > >>>>>>> On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: > >>>>>>>> On 04.10.2022 11:33, Roger Pau Monné wrote: > >>>>>>>>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: > >>>>>>>>>> On 30.09.2022 16:28, Roger Pau Monné wrote: > >>>>>>>>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: > >>>>>>>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME > >>>>>>>>>>>> higher priority than the type of the range. To avoid accessing memory at > >>>>>>>>>>>> runtime which was re-used for other purposes, make > >>>>>>>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the > >>>>>>>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > >>>>>>>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. > >>>>>>>>>>> > >>>>>>>>>>> What about dom0? Should it be translated to E820_RESERVED so that > >>>>>>>>>>> dom0 doesn't try to use it either? > >>>>>>>>>> > >>>>>>>>>> I'm afraid I don't understand the questions. Not the least because I > >>>>>>>>>> think "it" can't really mean "dom0" from the earlier sentence. > >>>>>>>>> > >>>>>>>>> Sorry, let me try again: > >>>>>>>>> > >>>>>>>>> The memory map provided to dom0 will contain E820_ACPI entries for > >>>>>>>>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory > >>>>>>>>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, > >>>>>>>>> overwriting the data needed for runtime services? > >>>>>>>> > >>>>>>>> How would Dom0 go about doing so? It has no control over what we hand > >>>>>>>> to the page allocator - it can only free pages which were actually > >>>>>>>> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to > >>>>>>>> DomIO - Dom0 can map and access them, but it cannot free them. > >>>>>>> > >>>>>>> Maybe I'm very confused, but what about dom0 overwriting the data > >>>>>>> there, won't it cause issues to runtime services? > >>>>>> > >>>>>> If it overwrites it, of course there are going to be issues. Just like > >>>>>> there are going to be problems from anything else Dom0 does wrong. > >>>>> > >>>>> But would dom0 know it's doing something wrong? > >>>> > >>>> Yes. Please also see my reply to Andrew. > >>>> > >>>>> The region is just marked as E820_ACPI from dom0 PoV, so it doesn't > >>>>> know it's required by EFI runtime services, and dom0 could > >>>>> legitimately overwrite the region once it considers all ACPI parsing > >>>>> done from it's side. > >>>> > >>>> PV Dom0 won't ever see E820_ACPI in the relevant E820 map; this type can > >>>> only appear in the machine E820. In how far PVH Dom0 might need to take > >>>> special care I can't tell right now (but at least for kexec purposes I > >>>> expect Linux isn't going to recycle E820_ACPI regions even going forward). > >>> > >>> Even if unlikely, couldn't some dom0 OS look at the machine map after > >>> processing ACPI and just decide to overwrite the ACPI regions? > >>> > >>> Not that it's useful from an OS PoV, but also we have no statement > >>> saying that E820_ACPI in the machine memory map shouldn't be > >>> overwritten. > >> > >> There are many things we have no statements for, yet we imply certain > >> behavior or restrictions. The machine memory map, imo, clearly isn't > >> intended for this kind of use. > > > > There isn't much I can say then. I do feel we are creating rules out > > of thin air. > > > > I do think the commit message should mention that we rely on dom0 not > > overwriting the data in the E820_ACPI regions on the machine memory > > map. > > Hmm, am I getting it right that you think I need to add further > justification for a change I'm _not_ making? In the commit message you explicitly mentioned 'we don't actually "reclaim" E820_ACPI memory' and I assumed that "we" in the sentence to only include Xen. Now I see that the "we" there seems to include both Xen and the dom0 kernel. This wasn't clear to me at first sight. > And which, if we wanted > to change our behavior, would require a similar change (or perhaps a > change elsewhere) in E820 (i.e. non-EFI) handling? Why would that be required? Without EFI dom0 should be fine in overwriting (some?) of the data in E820_ACPI regions once it's finished with all ACPI processing, as a region of type E820_ACPI is reclaimable and Xen won't try to access it once handled to dom0. > The modification > I'm making is solely towards Xen's internal memory management. I'm > really having a hard time seeing how commenting on expected Dom0 > behavior would fit here The type in the e820 memory map also gets propagated to dom0 in the machine memory map hypercall, so it can have effect outside of Xen itself. > (leaving aside that I'm still puzzled by both > you and Andrew thinking that there's any whatsoever remote indication > anywhere that Dom0 recycling E820_ACPI could be an okay thing in a PV > Dom0 kernel). The more that marking EfiACPIReclaimMemory anything > other than E820_ACPI might, as iirc you did say yourself, also confuse > e.g. the ACPI subsystem of Dom0's kernel. Indeed. There's no good way to convert a region of type EfiACPIReclaimMemory that has the EFI_MEMORY_RUNTIME attribute set, as there's no mapping to an e820 type. One of the quirks of trying to retrofit an EFI memory map into e820 format. > But well, would extending that sentence to "While on x86 in theory the > same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" > E820_ACPI memory there (and it would be a bug if the Dom0 kernel tried > to do so, bypassing Xen's memory management), hence that type's > handling can be left alone" satisfy your request? I think that would indeed make it clearer. Thanks, Roger.
On 04.10.2022 17:20, Roger Pau Monné wrote: > On Tue, Oct 04, 2022 at 04:39:26PM +0200, Jan Beulich wrote: >> On 04.10.2022 16:01, Roger Pau Monné wrote: >>> On Tue, Oct 04, 2022 at 03:10:57PM +0200, Jan Beulich wrote: >>>> On 04.10.2022 14:52, Roger Pau Monné wrote: >>>>> On Tue, Oct 04, 2022 at 02:18:31PM +0200, Jan Beulich wrote: >>>>>> On 04.10.2022 12:54, Roger Pau Monné wrote: >>>>>>> On Tue, Oct 04, 2022 at 12:44:16PM +0200, Jan Beulich wrote: >>>>>>>> On 04.10.2022 12:38, Roger Pau Monné wrote: >>>>>>>>> On Tue, Oct 04, 2022 at 12:23:23PM +0200, Jan Beulich wrote: >>>>>>>>>> On 04.10.2022 11:33, Roger Pau Monné wrote: >>>>>>>>>>> On Tue, Oct 04, 2022 at 10:06:36AM +0200, Jan Beulich wrote: >>>>>>>>>>>> On 30.09.2022 16:28, Roger Pau Monné wrote: >>>>>>>>>>>>> On Fri, Sep 30, 2022 at 09:50:40AM +0200, Jan Beulich wrote: >>>>>>>>>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>>>>>>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>>>>>>>>>> runtime which was re-used for other purposes, make >>>>>>>>>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>>>>>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>>>>>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>>>>>>>>>> >>>>>>>>>>>>> What about dom0? Should it be translated to E820_RESERVED so that >>>>>>>>>>>>> dom0 doesn't try to use it either? >>>>>>>>>>>> >>>>>>>>>>>> I'm afraid I don't understand the questions. Not the least because I >>>>>>>>>>>> think "it" can't really mean "dom0" from the earlier sentence. >>>>>>>>>>> >>>>>>>>>>> Sorry, let me try again: >>>>>>>>>>> >>>>>>>>>>> The memory map provided to dom0 will contain E820_ACPI entries for >>>>>>>>>>> memory ranges with the EFI_MEMORY_RUNTIME attributes in the EFI memory >>>>>>>>>>> map. Is there a risk from dom0 reclaiming such E820_ACPI ranges, >>>>>>>>>>> overwriting the data needed for runtime services? >>>>>>>>>> >>>>>>>>>> How would Dom0 go about doing so? It has no control over what we hand >>>>>>>>>> to the page allocator - it can only free pages which were actually >>>>>>>>>> allocated to it. E820_ACPI and E820_RESERVED pages are assigned to >>>>>>>>>> DomIO - Dom0 can map and access them, but it cannot free them. >>>>>>>>> >>>>>>>>> Maybe I'm very confused, but what about dom0 overwriting the data >>>>>>>>> there, won't it cause issues to runtime services? >>>>>>>> >>>>>>>> If it overwrites it, of course there are going to be issues. Just like >>>>>>>> there are going to be problems from anything else Dom0 does wrong. >>>>>>> >>>>>>> But would dom0 know it's doing something wrong? >>>>>> >>>>>> Yes. Please also see my reply to Andrew. >>>>>> >>>>>>> The region is just marked as E820_ACPI from dom0 PoV, so it doesn't >>>>>>> know it's required by EFI runtime services, and dom0 could >>>>>>> legitimately overwrite the region once it considers all ACPI parsing >>>>>>> done from it's side. >>>>>> >>>>>> PV Dom0 won't ever see E820_ACPI in the relevant E820 map; this type can >>>>>> only appear in the machine E820. In how far PVH Dom0 might need to take >>>>>> special care I can't tell right now (but at least for kexec purposes I >>>>>> expect Linux isn't going to recycle E820_ACPI regions even going forward). >>>>> >>>>> Even if unlikely, couldn't some dom0 OS look at the machine map after >>>>> processing ACPI and just decide to overwrite the ACPI regions? >>>>> >>>>> Not that it's useful from an OS PoV, but also we have no statement >>>>> saying that E820_ACPI in the machine memory map shouldn't be >>>>> overwritten. >>>> >>>> There are many things we have no statements for, yet we imply certain >>>> behavior or restrictions. The machine memory map, imo, clearly isn't >>>> intended for this kind of use. >>> >>> There isn't much I can say then. I do feel we are creating rules out >>> of thin air. >>> >>> I do think the commit message should mention that we rely on dom0 not >>> overwriting the data in the E820_ACPI regions on the machine memory >>> map. >> >> Hmm, am I getting it right that you think I need to add further >> justification for a change I'm _not_ making? > > In the commit message you explicitly mentioned 'we don't actually > "reclaim" E820_ACPI memory' and I assumed that "we" in the sentence to > only include Xen. Now I see that the "we" there seems to include both > Xen and the dom0 kernel. This wasn't clear to me at first sight. It was clear, actually, as I did mean Xen alone. It didn't even occur to me that one could consider Dom0 potentially trying to do so. >> And which, if we wanted >> to change our behavior, would require a similar change (or perhaps a >> change elsewhere) in E820 (i.e. non-EFI) handling? > > Why would that be required? Because if EFI can (ab)use that type for other purposes, why couldn't legacy firmware, too? > Without EFI dom0 should be fine in overwriting (some?) of the data in > E820_ACPI regions once it's finished with all ACPI processing, as a > region of type E820_ACPI is reclaimable and Xen won't try to access it > once handled to dom0. > >> The modification >> I'm making is solely towards Xen's internal memory management. I'm >> really having a hard time seeing how commenting on expected Dom0 >> behavior would fit here > > The type in the e820 memory map also gets propagated to dom0 in the > machine memory map hypercall, so it can have effect outside of Xen > itself. If used beyond the very limited intended purposes, yes. >> (leaving aside that I'm still puzzled by both >> you and Andrew thinking that there's any whatsoever remote indication >> anywhere that Dom0 recycling E820_ACPI could be an okay thing in a PV >> Dom0 kernel). The more that marking EfiACPIReclaimMemory anything >> other than E820_ACPI might, as iirc you did say yourself, also confuse >> e.g. the ACPI subsystem of Dom0's kernel. > > Indeed. There's no good way to convert a region of type > EfiACPIReclaimMemory that has the EFI_MEMORY_RUNTIME attribute set, as > there's no mapping to an e820 type. > > One of the quirks of trying to retrofit an EFI memory map into e820 > format. > >> But well, would extending that sentence to "While on x86 in theory the >> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >> E820_ACPI memory there (and it would be a bug if the Dom0 kernel tried >> to do so, bypassing Xen's memory management), hence that type's >> handling can be left alone" satisfy your request? > > I think that would indeed make it clearer. Okay, I'll make the adjustment then and submit a v2. This will now need an ack also by Henry anyway. Jan
On 30.09.2022 14:51, Bertrand Marquis wrote: >> On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: >> >> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >> higher priority than the type of the range. To avoid accessing memory at >> runtime which was re-used for other purposes, make >> efi_arch_process_memory_map() follow suit. While on x86 in theory the >> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >> E820_ACPI memory there and hence that type's handling can be left alone. >> >> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >> Fixes: facac0af87ef ("x86-64: EFI runtime code") >> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >> Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm Thanks. However ... >> --- >> Partly RFC for Arm, for two reasons: >> >> On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: >> For one like on x86 such ranges would likely better be retained, as Dom0 >> may (will?) have a need to look at tables placed there. Plus converting >> such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to >> me as well. I'd be inclined to make the latter adjustment right here >> (while the other change probably would better be separate, if there >> aren't actually reasons for the present behavior). ... any views on this WB aspect at least (also Stefano or Julien)? Would be good to know before I send v2. Jan
Hi Jan, On 04/10/2022 16:58, Jan Beulich wrote: > On 30.09.2022 14:51, Bertrand Marquis wrote: >>> On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: >>> >>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>> higher priority than the type of the range. To avoid accessing memory at >>> runtime which was re-used for other purposes, make >>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>> E820_ACPI memory there and hence that type's handling can be left alone. >>> >>> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >>> Fixes: facac0af87ef ("x86-64: EFI runtime code") >>> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> >> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm > > Thanks. However ... > >>> --- >>> Partly RFC for Arm, for two reasons: >>> >>> On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: >>> For one like on x86 such ranges would likely better be retained, as Dom0 >>> may (will?) have a need to look at tables placed there. Plus converting >>> such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to >>> me as well. I'd be inclined to make the latter adjustment right here >>> (while the other change probably would better be separate, if there >>> aren't actually reasons for the present behavior). > > ... any views on this WB aspect at least (also Stefano or Julien)? Would be > good to know before I send v2. I don't quite understand what you are questioning here. Looking at the code, EfiACPIReclaimMemory will not be converted to RAM but added in a separate array. Furthermore, all the EfiACPIReclaimMemory regions will be passed to dom0 (see acpi_create_efi_mmap_table()). So to me the code looks correct. Cheers,
On 05.10.2022 12:44, Julien Grall wrote: > On 04/10/2022 16:58, Jan Beulich wrote: >> On 30.09.2022 14:51, Bertrand Marquis wrote: >>>> On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: >>>> >>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>> higher priority than the type of the range. To avoid accessing memory at >>>> runtime which was re-used for other purposes, make >>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>> >>>> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >>>> Fixes: facac0af87ef ("x86-64: EFI runtime code") >>>> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>> >>> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm >> >> Thanks. However ... >> >>>> --- >>>> Partly RFC for Arm, for two reasons: >>>> >>>> On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: >>>> For one like on x86 such ranges would likely better be retained, as Dom0 >>>> may (will?) have a need to look at tables placed there. Plus converting >>>> such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to >>>> me as well. I'd be inclined to make the latter adjustment right here >>>> (while the other change probably would better be separate, if there >>>> aren't actually reasons for the present behavior). >> >> ... any views on this WB aspect at least (also Stefano or Julien)? Would be >> good to know before I send v2. > > I don't quite understand what you are questioning here. Looking at the > code, EfiACPIReclaimMemory will not be converted to RAM but added in a > separate array. > > Furthermore, all the EfiACPIReclaimMemory regions will be passed to dom0 > (see acpi_create_efi_mmap_table()). > > So to me the code looks correct. Oh, I've indeed not paid enough attention to the first argument passed to meminfo_add_bank(). I'm sorry for the extra noise. However, the question I wanted to have addressed before sending out v3 was that regarding the present using of memory when EFI_MEMORY_WB is not set. Is that correct for the EfiACPIReclaimMemory case, i.e. is the consumer (Dom0) aware that there might be a restriction? And would this memory then be guaranteed to never be freed into the general pool of RAM pages? Jan
Hi Jan, On 05/10/2022 12:55, Jan Beulich wrote: > On 05.10.2022 12:44, Julien Grall wrote: >> On 04/10/2022 16:58, Jan Beulich wrote: >>> On 30.09.2022 14:51, Bertrand Marquis wrote: >>>>> On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: >>>>> >>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>> higher priority than the type of the range. To avoid accessing memory at >>>>> runtime which was re-used for other purposes, make >>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>> >>>>> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >>>>> Fixes: facac0af87ef ("x86-64: EFI runtime code") >>>>> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>> >>>> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm >>> >>> Thanks. However ... >>> >>>>> --- >>>>> Partly RFC for Arm, for two reasons: >>>>> >>>>> On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: >>>>> For one like on x86 such ranges would likely better be retained, as Dom0 >>>>> may (will?) have a need to look at tables placed there. Plus converting >>>>> such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to >>>>> me as well. I'd be inclined to make the latter adjustment right here >>>>> (while the other change probably would better be separate, if there >>>>> aren't actually reasons for the present behavior). >>> >>> ... any views on this WB aspect at least (also Stefano or Julien)? Would be >>> good to know before I send v2. >> >> I don't quite understand what you are questioning here. Looking at the >> code, EfiACPIReclaimMemory will not be converted to RAM but added in a >> separate array. >> >> Furthermore, all the EfiACPIReclaimMemory regions will be passed to dom0 >> (see acpi_create_efi_mmap_table()). >> >> So to me the code looks correct. > > Oh, I've indeed not paid enough attention to the first argument passed > to meminfo_add_bank(). I'm sorry for the extra noise. However, the > question I wanted to have addressed before sending out v3 was that > regarding the present using of memory when EFI_MEMORY_WB is not set. > Is that correct for the EfiACPIReclaimMemory case, i.e. is the > consumer (Dom0) aware that there might be a restriction? Looking at the code, we always set EFI_MEMORY_WB for the reclaimable region and the stage-2 mapping will be cachable. So it looks like there would be a mismatch if EFI_MEMORY_WB is not set. However, given the region is reclaimable, shouldn't this imply that the flag is always set? > And would > this memory then be guaranteed to never be freed into the general pool > of RAM pages? The region is not treated as RAM by Xen and not owned by the dom0. Therefore, it should not be possible to free the page because get_page_from_gfn() would not be able to get a reference. Cheers,
On 05.10.2022 20:09, Julien Grall wrote: > Hi Jan, > > On 05/10/2022 12:55, Jan Beulich wrote: >> On 05.10.2022 12:44, Julien Grall wrote: >>> On 04/10/2022 16:58, Jan Beulich wrote: >>>> On 30.09.2022 14:51, Bertrand Marquis wrote: >>>>>> On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: >>>>>> >>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>> runtime which was re-used for other purposes, make >>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>>> >>>>>> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >>>>>> Fixes: facac0af87ef ("x86-64: EFI runtime code") >>>>>> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>>> >>>>> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm >>>> >>>> Thanks. However ... >>>> >>>>>> --- >>>>>> Partly RFC for Arm, for two reasons: >>>>>> >>>>>> On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: >>>>>> For one like on x86 such ranges would likely better be retained, as Dom0 >>>>>> may (will?) have a need to look at tables placed there. Plus converting >>>>>> such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to >>>>>> me as well. I'd be inclined to make the latter adjustment right here >>>>>> (while the other change probably would better be separate, if there >>>>>> aren't actually reasons for the present behavior). >>>> >>>> ... any views on this WB aspect at least (also Stefano or Julien)? Would be >>>> good to know before I send v2. >>> >>> I don't quite understand what you are questioning here. Looking at the >>> code, EfiACPIReclaimMemory will not be converted to RAM but added in a >>> separate array. >>> >>> Furthermore, all the EfiACPIReclaimMemory regions will be passed to dom0 >>> (see acpi_create_efi_mmap_table()). >>> >>> So to me the code looks correct. >> >> Oh, I've indeed not paid enough attention to the first argument passed >> to meminfo_add_bank(). I'm sorry for the extra noise. However, the >> question I wanted to have addressed before sending out v3 was that >> regarding the present using of memory when EFI_MEMORY_WB is not set. >> Is that correct for the EfiACPIReclaimMemory case, i.e. is the >> consumer (Dom0) aware that there might be a restriction? > > Looking at the code, we always set EFI_MEMORY_WB for the reclaimable > region and the stage-2 mapping will be cachable. > > So it looks like there would be a mismatch if EFI_MEMORY_WB is not set. > However, given the region is reclaimable, shouldn't this imply that the > flag is always set? Possibly (but then again consider [perhaps hypothetical] systems with e.g. just WT caches, where specifying WB simply wouldn't make sense). In any event, even if that's the case, being on the safe side and doing if ( (desc_ptr->Attribute & EFI_MEMORY_RUNTIME) || !(desc_ptr->Attribute & EFI_MEMORY_WB) ) /* nothing */; else if ( ... would seem better to me. However, if the mapping you mention above would be adjusted and ... >> And would >> this memory then be guaranteed to never be freed into the general pool >> of RAM pages? > > The region is not treated as RAM by Xen and not owned by the dom0. > Therefore, it should not be possible to free the page because > get_page_from_gfn() would not be able to get a reference. ... the space cannot become ordinary RAM, then such a precaution wouldn't be necessary. After all hiding EfiACPIReclaimMemory from Dom0 just because it can't be mapped WB wouldn't be very nice either. I guess I'll submit v2 with this part of the change left as it was. Jan
On 06.10.2022 10:39, Jan Beulich wrote: > On 05.10.2022 20:09, Julien Grall wrote: >> Hi Jan, >> >> On 05/10/2022 12:55, Jan Beulich wrote: >>> On 05.10.2022 12:44, Julien Grall wrote: >>>> On 04/10/2022 16:58, Jan Beulich wrote: >>>>> On 30.09.2022 14:51, Bertrand Marquis wrote: >>>>>>> On 30 Sep 2022, at 09:50, Jan Beulich <jbeulich@suse.com> wrote: >>>>>>> >>>>>>> efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME >>>>>>> higher priority than the type of the range. To avoid accessing memory at >>>>>>> runtime which was re-used for other purposes, make >>>>>>> efi_arch_process_memory_map() follow suit. While on x86 in theory the >>>>>>> same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" >>>>>>> E820_ACPI memory there and hence that type's handling can be left alone. >>>>>>> >>>>>>> Fixes: bf6501a62e80 ("x86-64: EFI boot code") >>>>>>> Fixes: facac0af87ef ("x86-64: EFI runtime code") >>>>>>> Fixes: 6d70ea10d49f ("Add ARM EFI boot support") >>>>>>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>>>>> >>>>>> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> #arm >>>>> >>>>> Thanks. However ... >>>>> >>>>>>> --- >>>>>>> Partly RFC for Arm, for two reasons: >>>>>>> >>>>>>> On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: >>>>>>> For one like on x86 such ranges would likely better be retained, as Dom0 >>>>>>> may (will?) have a need to look at tables placed there. Plus converting >>>>>>> such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to >>>>>>> me as well. I'd be inclined to make the latter adjustment right here >>>>>>> (while the other change probably would better be separate, if there >>>>>>> aren't actually reasons for the present behavior). >>>>> >>>>> ... any views on this WB aspect at least (also Stefano or Julien)? Would be >>>>> good to know before I send v2. >>>> >>>> I don't quite understand what you are questioning here. Looking at the >>>> code, EfiACPIReclaimMemory will not be converted to RAM but added in a >>>> separate array. >>>> >>>> Furthermore, all the EfiACPIReclaimMemory regions will be passed to dom0 >>>> (see acpi_create_efi_mmap_table()). >>>> >>>> So to me the code looks correct. >>> >>> Oh, I've indeed not paid enough attention to the first argument passed >>> to meminfo_add_bank(). I'm sorry for the extra noise. However, the >>> question I wanted to have addressed before sending out v3 was that >>> regarding the present using of memory when EFI_MEMORY_WB is not set. >>> Is that correct for the EfiACPIReclaimMemory case, i.e. is the >>> consumer (Dom0) aware that there might be a restriction? >> >> Looking at the code, we always set EFI_MEMORY_WB for the reclaimable >> region and the stage-2 mapping will be cachable. >> >> So it looks like there would be a mismatch if EFI_MEMORY_WB is not set. >> However, given the region is reclaimable, shouldn't this imply that the >> flag is always set? > > Possibly (but then again consider [perhaps hypothetical] systems with e.g. > just WT caches, where specifying WB simply wouldn't make sense). In any > event, even if that's the case, being on the safe side and doing > > if ( (desc_ptr->Attribute & EFI_MEMORY_RUNTIME) || > !(desc_ptr->Attribute & EFI_MEMORY_WB) ) > /* nothing */; > else if ( ... > > would seem better to me. However, if the mapping you mention above > would be adjusted and ... > >>> And would >>> this memory then be guaranteed to never be freed into the general pool >>> of RAM pages? >> >> The region is not treated as RAM by Xen and not owned by the dom0. >> Therefore, it should not be possible to free the page because >> get_page_from_gfn() would not be able to get a reference. > > ... the space cannot become ordinary RAM, then such a precaution > wouldn't be necessary. After all hiding EfiACPIReclaimMemory from > Dom0 just because it can't be mapped WB wouldn't be very nice > either. I guess I'll submit v2 with this part of the change left > as it was. And while already in the process of committing the patch I came to realize that if the WB conditional isn't supposed to move, isn't the change done for Arm then wrong as well? Shouldn't it then be if ( !(desc_ptr->Attribute & EFI_MEMORY_RUNTIME) && (desc_ptr->Attribute & EFI_MEMORY_WB) && (desc_ptr->Type == EfiConventionalMemory || ... leaving the EfiACPIReclaimMemory case entirely unaffected by the change? Jan
Hi Jan, On 06/10/2022 15:11, Jan Beulich wrote: >> ... the space cannot become ordinary RAM, then such a precaution >> wouldn't be necessary. After all hiding EfiACPIReclaimMemory from >> Dom0 just because it can't be mapped WB wouldn't be very nice >> either. I guess I'll submit v2 with this part of the change left >> as it was. > > And while already in the process of committing the patch I came to > realize that if the WB conditional isn't supposed to move, isn't > the change done for Arm then wrong as well? Shouldn't it then be > > if ( !(desc_ptr->Attribute & EFI_MEMORY_RUNTIME) && > (desc_ptr->Attribute & EFI_MEMORY_WB) && > (desc_ptr->Type == EfiConventionalMemory || > ... > > leaving the EfiACPIReclaimMemory case entirely unaffected by the > change? IIUC, the concern is the region EfiACPIReclaimMemory could have the attribute EFI_MEMORY_RUNTIME. Is that correct? Given that the memory is reclaimable, I am not sure why it can also have this atribute set (to me it means the opposite). But I guess for hardening purpose it would be better to use the version you just suggested. Bertrand, Stefano, what do you think? Cheers,
On 08.10.2022 21:08, Julien Grall wrote: > On 06/10/2022 15:11, Jan Beulich wrote: >>> ... the space cannot become ordinary RAM, then such a precaution >>> wouldn't be necessary. After all hiding EfiACPIReclaimMemory from >>> Dom0 just because it can't be mapped WB wouldn't be very nice >>> either. I guess I'll submit v2 with this part of the change left >>> as it was. >> >> And while already in the process of committing the patch I came to >> realize that if the WB conditional isn't supposed to move, isn't >> the change done for Arm then wrong as well? Shouldn't it then be >> >> if ( !(desc_ptr->Attribute & EFI_MEMORY_RUNTIME) && >> (desc_ptr->Attribute & EFI_MEMORY_WB) && >> (desc_ptr->Type == EfiConventionalMemory || >> ... >> >> leaving the EfiACPIReclaimMemory case entirely unaffected by the >> change? > > IIUC, the concern is the region EfiACPIReclaimMemory could have the attribute EFI_MEMORY_RUNTIME. Is that correct? Yes, ... > Given that the memory is reclaimable, I am not sure why it can also have this atribute set (to me it means the opposite). ... at least on x86 all sorts of strange/bogus type/attribute combinations have been observed. Jan > But I guess for hardening purpose it would be better to use the version you just suggested. > > Bertrand, Stefano, what do you think? > > Cheers, >
On Mon, 10 Oct 2022, Jan Beulich wrote: > On 08.10.2022 21:08, Julien Grall wrote: > > On 06/10/2022 15:11, Jan Beulich wrote: > >>> ... the space cannot become ordinary RAM, then such a precaution > >>> wouldn't be necessary. After all hiding EfiACPIReclaimMemory from > >>> Dom0 just because it can't be mapped WB wouldn't be very nice > >>> either. I guess I'll submit v2 with this part of the change left > >>> as it was. > >> > >> And while already in the process of committing the patch I came to > >> realize that if the WB conditional isn't supposed to move, isn't > >> the change done for Arm then wrong as well? Shouldn't it then be > >> > >> if ( !(desc_ptr->Attribute & EFI_MEMORY_RUNTIME) && > >> (desc_ptr->Attribute & EFI_MEMORY_WB) && > >> (desc_ptr->Type == EfiConventionalMemory || > >> ... > >> > >> leaving the EfiACPIReclaimMemory case entirely unaffected by the > >> change? > > > > IIUC, the concern is the region EfiACPIReclaimMemory could have the attribute EFI_MEMORY_RUNTIME. Is that correct? > > Yes, ... > > > Given that the memory is reclaimable, I am not sure why it can also have this atribute set (to me it means the opposite). > > ... at least on x86 all sorts of strange/bogus type/attribute combinations > have been observed. Yeah... it is a good idea to be able to cope with strange and bogus firmware tables as it is known to happen
Hi, > On 11 Oct 2022, at 00:58, Stefano Stabellini <sstabellini@kernel.org> wrote: > > On Mon, 10 Oct 2022, Jan Beulich wrote: >> On 08.10.2022 21:08, Julien Grall wrote: >>> On 06/10/2022 15:11, Jan Beulich wrote: >>>>> ... the space cannot become ordinary RAM, then such a precaution >>>>> wouldn't be necessary. After all hiding EfiACPIReclaimMemory from >>>>> Dom0 just because it can't be mapped WB wouldn't be very nice >>>>> either. I guess I'll submit v2 with this part of the change left >>>>> as it was. >>>> >>>> And while already in the process of committing the patch I came to >>>> realize that if the WB conditional isn't supposed to move, isn't >>>> the change done for Arm then wrong as well? Shouldn't it then be >>>> >>>> if ( !(desc_ptr->Attribute & EFI_MEMORY_RUNTIME) && >>>> (desc_ptr->Attribute & EFI_MEMORY_WB) && >>>> (desc_ptr->Type == EfiConventionalMemory || >>>> ... >>>> >>>> leaving the EfiACPIReclaimMemory case entirely unaffected by the >>>> change? >>> >>> IIUC, the concern is the region EfiACPIReclaimMemory could have the attribute EFI_MEMORY_RUNTIME. Is that correct? >> >> Yes, ... >> >>> Given that the memory is reclaimable, I am not sure why it can also have this atribute set (to me it means the opposite). >> >> ... at least on x86 all sorts of strange/bogus type/attribute combinations >> have been observed. > > Yeah... it is a good idea to be able to cope with strange and bogus > firmware tables as it is known to happen I agree with that but if we make an assumption that something is bogus, we should at least warn the user if possible. Bertrand
--- a/xen/arch/arm/efi/efi-boot.h +++ b/xen/arch/arm/efi/efi-boot.h @@ -183,13 +183,15 @@ static EFI_STATUS __init efi_process_mem for ( Index = 0; Index < (mmap_size / desc_size); Index++ ) { - if ( desc_ptr->Attribute & EFI_MEMORY_WB && - (desc_ptr->Type == EfiConventionalMemory || - desc_ptr->Type == EfiLoaderCode || - desc_ptr->Type == EfiLoaderData || - (!map_bs && - (desc_ptr->Type == EfiBootServicesCode || - desc_ptr->Type == EfiBootServicesData))) ) + if ( desc_ptr->Attribute & EFI_MEMORY_RUNTIME ) + /* nothing */; + else if ( (desc_ptr->Attribute & EFI_MEMORY_WB) && + (desc_ptr->Type == EfiConventionalMemory || + desc_ptr->Type == EfiLoaderCode || + desc_ptr->Type == EfiLoaderData || + (!map_bs && + (desc_ptr->Type == EfiBootServicesCode || + desc_ptr->Type == EfiBootServicesData))) ) { if ( !meminfo_add_bank(&bootinfo.mem, desc_ptr) ) { --- a/xen/arch/x86/efi/efi-boot.h +++ b/xen/arch/x86/efi/efi-boot.h @@ -185,7 +185,9 @@ static void __init efi_arch_process_memo /* fall through */ case EfiLoaderCode: case EfiLoaderData: - if ( desc->Attribute & EFI_MEMORY_WB ) + if ( desc->Attribute & EFI_MEMORY_RUNTIME ) + type = E820_RESERVED; + else if ( desc->Attribute & EFI_MEMORY_WB ) type = E820_RAM; else case EfiUnusableMemory:
efi_init_memory() in both relevant places is treating EFI_MEMORY_RUNTIME higher priority than the type of the range. To avoid accessing memory at runtime which was re-used for other purposes, make efi_arch_process_memory_map() follow suit. While on x86 in theory the same would apply to EfiACPIReclaimMemory, we don't actually "reclaim" E820_ACPI memory there and hence that type's handling can be left alone. Fixes: bf6501a62e80 ("x86-64: EFI boot code") Fixes: facac0af87ef ("x86-64: EFI runtime code") Fixes: 6d70ea10d49f ("Add ARM EFI boot support") Signed-off-by: Jan Beulich <jbeulich@suse.com> --- Partly RFC for Arm, for two reasons: On Arm I question the conversion of EfiACPIReclaimMemory, in two ways: For one like on x86 such ranges would likely better be retained, as Dom0 may (will?) have a need to look at tables placed there. Plus converting such ranges to RAM even if EFI_MEMORY_WB is not set looks suspicious to me as well. I'd be inclined to make the latter adjustment right here (while the other change probably would better be separate, if there aren't actually reasons for the present behavior). On Arm efi_init_memory() is compiled out, so adjusting Arm code here is perhaps more for consistency (not leaving a trap for someone to later fall into) than a strict requirement. I wonder though how Arm has managed to get away without at least some parts of efi_init_memory() for all the years that ACPI support has been present there. I guess this is connected to most of runtime.c also being compiled out, but that continuing to be the case is another aspect puzzling me.