Message ID: 1474599334-30709-1-git-send-email-peng.fan@nxp.com (mailing list archive)
State: New, archived
Hi Stefano, Julien Any comments on this v4 patch? Thanks, Peng On Fri, Sep 23, 2016 at 10:55:34AM +0800, Peng Fan wrote: >On AArch64 SoCs, some IPs may only have the capability to access >32 bits address space. The physical memory assigned for Dom0 maybe >not in 4GB address space, then the IPs will not work properly. >So need to allocate memory under 4GB for Dom0. > >There is no restriction that how much lowmem needs to be allocated for >Dom0 ,so allocate lowmem as much as possible for Dom0. > >This patch does not affect 32-bit domain, because Variable "lowmem" is >set to true at the beginning. If failed to allocate bank0 under 4GB, >need to panic for 32-bit domain, because 32-bit domain requires bank0 >be allocated under 4GB. > >For 64-bit domain, set "lowmem" to false, and continue allocating >memory from above 4GB. > >Signed-off-by: Peng Fan <peng.fan@nxp.com> >Cc: Stefano Stabellini <sstabellini@kernel.org> >Cc: Julien Grall <julien.grall@arm.com> >--- > >This patch is to resolve the issue mentioned in >https://lists.xen.org/archives/html/xen-devel/2016-09/msg00235.html > > Tested results: > (XEN) Allocating 1:1 mappings totalling 2048MB for dom0: > (XEN) BANK[0] 0x00000088000000-0x000000f8000000 (1792MB) > (XEN) BANK[1] 0x000009e0000000-0x000009f0000000 (256MB) > 1792M allocated in 4GB address space. > >V4: > Address comments in V3: https://lists.xen.org/archives/html/xen-devel/2016-09/msg02499.html > Drop uneccessary check when failed to allocate memory under 4GB > Refine comments according to Julien's suggestion in V3. > Keep "bits <= (lowmem ? 32 : PADDR_BITS)", but not changed to "bits <= 32" > >V3: > Add more commit log > Add more comments > Add back panic if failed to allocate bank0 under 4GB for 32-bit domain. > >V2: > Remove the bootargs dom0_lowmem introduced in V1. > Following "https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01459.html" > to allocate as much as possible lowmem. > > xen/arch/arm/domain_build.c | 33 ++++++++++++++++++++++----------- > 1 file changed, 22 insertions(+), 11 deletions(-) > >diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c >index 35ab08d..6b5ac8d 100644 >--- a/xen/arch/arm/domain_build.c >+++ b/xen/arch/arm/domain_build.c >@@ -195,9 +195,9 @@ fail: > * bank. Partly this is just easier for us to deal with, but also > * the ramdisk and DTB must be placed within a certain proximity of > * the kernel within RAM. >- * 3. For 32-bit dom0 we want to place as much of the RAM as we >- * reasonably can below 4GB, so that it can be used by non-LPAE >- * enabled kernels. >+ * 3. For dom0 we want to place as much of the RAM as we reasonably can >+ * below 4GB, so that it can be used by non-LPAE enabled kernels (32-bit) >+ * or when a device assigned to dom0 can only do 32-bit DMA access. > * 4. For 32-bit dom0 the kernel must be located below 4GB. > * 5. We want to have a few largers banks rather than many smaller ones. > * >@@ -230,7 +230,8 @@ fail: > * we give up. > * > * For 32-bit domain we require that the initial allocation for the >- * first bank is under 4G. Then for the subsequent allocations we >+ * first bank is under 4G. For 64-bit domain, the first bank is preferred >+ * to be allocated under 4G. Then for the subsequent allocations we > * initially allocate memory only from below 4GB. Once that runs out > * (as described above) we allow higher allocations and continue until > * that runs out (or we have allocated sufficient dom0 memory). 
>@@ -244,7 +245,7 @@ static void allocate_memory(struct domain *d, struct kernel_info *kinfo) > unsigned int order = get_11_allocation_size(kinfo->unassigned_mem); > int i; > >- bool_t lowmem = is_32bit_domain(d); >+ bool_t lowmem = true; > unsigned int bits; > > /* >@@ -269,20 +270,30 @@ static void allocate_memory(struct domain *d, struct kernel_info *kinfo) > { > pg = alloc_domheap_pages(d, order, MEMF_bits(bits)); > if ( pg != NULL ) >+ { >+ if ( !insert_11_bank(d, kinfo, pg, order) ) >+ BUG(); /* Cannot fail for first bank */ >+ > goto got_bank0; >+ } > } > order--; > } > >- panic("Unable to allocate first memory bank"); >- >- got_bank0: >+ /* Failed to allocate bank0 under 4GB */ >+ if ( is_32bit_domain(d) ) >+ panic("Unable to allocate first memory bank."); > >- if ( !insert_11_bank(d, kinfo, pg, order) ) >- BUG(); /* Cannot fail for first bank */ >+ /* Try to allocate memory from above 4GB */ >+ printk(XENLOG_INFO "No bank has been allocated below 4GB.\n"); >+ lowmem = false; > >- /* Now allocate more memory and fill in additional banks */ >+ got_bank0: > >+ /* >+ * If we failed to allocate bank0 under 4GB, continue allocating >+ * memory from above 4GB and fill in banks. >+ */ > order = get_11_allocation_size(kinfo->unassigned_mem); > while ( kinfo->unassigned_mem && kinfo->mem.nr_banks < NR_MEM_BANKS ) > { >-- >2.6.6 >
Hi Peng, Sorry for the late answer. On 23/09/2016 03:55, Peng Fan wrote: > On AArch64 SoCs, some IPs may only have the capability to access > 32 bits address space. The physical memory assigned for Dom0 maybe > not in 4GB address space, then the IPs will not work properly. > So need to allocate memory under 4GB for Dom0. > > There is no restriction that how much lowmem needs to be allocated for > Dom0 ,so allocate lowmem as much as possible for Dom0. > > This patch does not affect 32-bit domain, because Variable "lowmem" is > set to true at the beginning. If failed to allocate bank0 under 4GB, > need to panic for 32-bit domain, because 32-bit domain requires bank0 > be allocated under 4GB. > > For 64-bit domain, set "lowmem" to false, and continue allocating > memory from above 4GB. > > Signed-off-by: Peng Fan <peng.fan@nxp.com> > Cc: Stefano Stabellini <sstabellini@kernel.org> > Cc: Julien Grall <julien.grall@arm.com> Reviewed-by: Julien Grall <julien.grall@arm.com> I am undecided whether this should be considered as a bug fix for Xen 4.8. Are you aware of any ARM64 platform we currently support requiring allocation of memory below 4GB? Regards,
On Tue, 1 Nov 2016, Julien Grall wrote: > Hi Peng, > > Sorry for the late answer. > > On 23/09/2016 03:55, Peng Fan wrote: > > On AArch64 SoCs, some IPs may only have the capability to access > > 32 bits address space. The physical memory assigned for Dom0 maybe > > not in 4GB address space, then the IPs will not work properly. > > So need to allocate memory under 4GB for Dom0. > > > > There is no restriction that how much lowmem needs to be allocated for > > Dom0 ,so allocate lowmem as much as possible for Dom0. > > > > This patch does not affect 32-bit domain, because Variable "lowmem" is > > set to true at the beginning. If failed to allocate bank0 under 4GB, > > need to panic for 32-bit domain, because 32-bit domain requires bank0 > > be allocated under 4GB. > > > > For 64-bit domain, set "lowmem" to false, and continue allocating > > memory from above 4GB. > > > > Signed-off-by: Peng Fan <peng.fan@nxp.com> > > Cc: Stefano Stabellini <sstabellini@kernel.org> > > Cc: Julien Grall <julien.grall@arm.com> > > Reviewed-by: Julien Grall <julien.grall@arm.com> > > I am undecided whether this should be considered as a bug fix for Xen 4.8. Are > you aware of any ARM64 platform we currently support requiring allocation of > memory below 4GB? I am more comfortable having this in 4.9 (I queued it up in xen-arm-next for now), unless we have a regression, a concrete problem, with an existing supported platform, like you wrote.
Hi Julien, Sorry for late reply. On Tue, Nov 01, 2016 at 02:42:06PM +0000, Julien Grall wrote: >Hi Peng, > >Sorry for the late answer. > >On 23/09/2016 03:55, Peng Fan wrote: >>On AArch64 SoCs, some IPs may only have the capability to access >>32 bits address space. The physical memory assigned for Dom0 maybe >>not in 4GB address space, then the IPs will not work properly. >>So need to allocate memory under 4GB for Dom0. >> >>There is no restriction that how much lowmem needs to be allocated for >>Dom0 ,so allocate lowmem as much as possible for Dom0. >> >>This patch does not affect 32-bit domain, because Variable "lowmem" is >>set to true at the beginning. If failed to allocate bank0 under 4GB, >>need to panic for 32-bit domain, because 32-bit domain requires bank0 >>be allocated under 4GB. >> >>For 64-bit domain, set "lowmem" to false, and continue allocating >>memory from above 4GB. >> >>Signed-off-by: Peng Fan <peng.fan@nxp.com> >>Cc: Stefano Stabellini <sstabellini@kernel.org> >>Cc: Julien Grall <julien.grall@arm.com> > >Reviewed-by: Julien Grall <julien.grall@arm.com> > >I am undecided whether this should be considered as a bug fix for Xen 4.8. >Are you aware of any ARM64 platform we currently support requiring allocation >of memory below 4GB? I have no idea about this (:, but I think this is a bug fix. Alought current supported platforms works well, users may choose 4.8 to support their new platform which has the limitation to access 64bit address. Regards, Peng. > >Regards, > >-- >Julien Grall
(CC Wei as release manager) On 10/11/16 08:30, Peng Fan wrote: > Hi Julien, Hi Peng, > On Tue, Nov 01, 2016 at 02:42:06PM +0000, Julien Grall wrote: >> Hi Peng, >> >> Sorry for the late answer. >> >> On 23/09/2016 03:55, Peng Fan wrote: >>> On AArch64 SoCs, some IPs may only have the capability to access >>> 32 bits address space. The physical memory assigned for Dom0 maybe >>> not in 4GB address space, then the IPs will not work properly. >>> So need to allocate memory under 4GB for Dom0. >>> >>> There is no restriction that how much lowmem needs to be allocated for >>> Dom0 ,so allocate lowmem as much as possible for Dom0. >>> >>> This patch does not affect 32-bit domain, because Variable "lowmem" is >>> set to true at the beginning. If failed to allocate bank0 under 4GB, >>> need to panic for 32-bit domain, because 32-bit domain requires bank0 >>> be allocated under 4GB. >>> >>> For 64-bit domain, set "lowmem" to false, and continue allocating >>> memory from above 4GB. >>> >>> Signed-off-by: Peng Fan <peng.fan@nxp.com> >>> Cc: Stefano Stabellini <sstabellini@kernel.org> >>> Cc: Julien Grall <julien.grall@arm.com> >> >> Reviewed-by: Julien Grall <julien.grall@arm.com> >> >> I am undecided whether this should be considered as a bug fix for Xen 4.8. >> Are you aware of any ARM64 platform we currently support requiring allocation >> of memory below 4GB? > > I have no idea about this (:, but I think this is a bug fix. Alought current > supported platforms works well, users may choose 4.8 to support their > new platform which has the limitation to access 64bit address. We are already late in the release process (rc5) for Xen 4.8. So we need to be careful when including a bug fix and evaluate the pros and cons. This patch is modifying the DOM0 memory layout for all 64-bit platforms. So it could potentially break one of the platform we officially support (see [1] for a non-exhaustive list). We don't have a test suite running automatically for ARM64 at the moment (it is been working on), this means that manual testing needs to be done. I am not aware of any platform, in the list we supports, having this issue so I prefer to stay on the safe side and defer this patch for Xen 4.9. If a user cares about Xen 4.8 for their platforms, then they could request the patch to be backported in Xen 4.8 after the release and after extensive testing in staging. Regards, [1] https://wiki.xenproject.org/wiki/Xen_ARM_with_Virtualization_Extensions#Hardware > > Regards, > Peng. > >> >> Regards, >> >> -- >> Julien Grall
On Thu, Nov 10, 2016 at 01:01:38PM +0000, Julien Grall wrote: >(CC Wei as release manager) > >On 10/11/16 08:30, Peng Fan wrote: >>Hi Julien, > >Hi Peng, > >>On Tue, Nov 01, 2016 at 02:42:06PM +0000, Julien Grall wrote: >>>Hi Peng, >>> >>>Sorry for the late answer. >>> >>>On 23/09/2016 03:55, Peng Fan wrote: >>>>On AArch64 SoCs, some IPs may only have the capability to access >>>>32 bits address space. The physical memory assigned for Dom0 maybe >>>>not in 4GB address space, then the IPs will not work properly. >>>>So need to allocate memory under 4GB for Dom0. >>>> >>>>There is no restriction that how much lowmem needs to be allocated for >>>>Dom0 ,so allocate lowmem as much as possible for Dom0. >>>> >>>>This patch does not affect 32-bit domain, because Variable "lowmem" is >>>>set to true at the beginning. If failed to allocate bank0 under 4GB, >>>>need to panic for 32-bit domain, because 32-bit domain requires bank0 >>>>be allocated under 4GB. >>>> >>>>For 64-bit domain, set "lowmem" to false, and continue allocating >>>>memory from above 4GB. >>>> >>>>Signed-off-by: Peng Fan <peng.fan@nxp.com> >>>>Cc: Stefano Stabellini <sstabellini@kernel.org> >>>>Cc: Julien Grall <julien.grall@arm.com> >>> >>>Reviewed-by: Julien Grall <julien.grall@arm.com> >>> >>>I am undecided whether this should be considered as a bug fix for Xen 4.8. >>>Are you aware of any ARM64 platform we currently support requiring allocation >>>of memory below 4GB? >> >>I have no idea about this (:, but I think this is a bug fix. Alought current >>supported platforms works well, users may choose 4.8 to support their >>new platform which has the limitation to access 64bit address. > >We are already late in the release process (rc5) for Xen 4.8. So we need to >be careful when including a bug fix and evaluate the pros and cons. > >This patch is modifying the DOM0 memory layout for all 64-bit platforms. So >it could potentially break one of the platform we officially support (see >[1] for a non-exhaustive list). We don't have a test suite running >automatically for ARM64 at the moment (it is been working on), this means >that manual testing needs to be done. I am not aware of any platform, in the >list we supports, having this issue so I prefer to stay on the safe side and >defer this patch for Xen 4.9. Ok. Defer it for 4.9 to avoid breaking any platforms. :) > >If a user cares about Xen 4.8 for their platforms, then they could request >the patch to be backported in Xen 4.8 after the release and after extensive >testing in staging. Yeah. Agree Thanks, Peng.
On Thu, Nov 10, 2016 at 01:01:38PM +0000, Julien Grall wrote: > (CC Wei as release manager) > > On 10/11/16 08:30, Peng Fan wrote: > >Hi Julien, > > Hi Peng, > > >On Tue, Nov 01, 2016 at 02:42:06PM +0000, Julien Grall wrote: > >>Hi Peng, > >> > >>Sorry for the late answer. > >> > >>On 23/09/2016 03:55, Peng Fan wrote: > >>>On AArch64 SoCs, some IPs may only have the capability to access > >>>32 bits address space. The physical memory assigned for Dom0 maybe > >>>not in 4GB address space, then the IPs will not work properly. > >>>So need to allocate memory under 4GB for Dom0. > >>> > >>>There is no restriction that how much lowmem needs to be allocated for > >>>Dom0 ,so allocate lowmem as much as possible for Dom0. > >>> > >>>This patch does not affect 32-bit domain, because Variable "lowmem" is > >>>set to true at the beginning. If failed to allocate bank0 under 4GB, > >>>need to panic for 32-bit domain, because 32-bit domain requires bank0 > >>>be allocated under 4GB. > >>> > >>>For 64-bit domain, set "lowmem" to false, and continue allocating > >>>memory from above 4GB. > >>> > >>>Signed-off-by: Peng Fan <peng.fan@nxp.com> > >>>Cc: Stefano Stabellini <sstabellini@kernel.org> > >>>Cc: Julien Grall <julien.grall@arm.com> > >> > >>Reviewed-by: Julien Grall <julien.grall@arm.com> > >> > >>I am undecided whether this should be considered as a bug fix for Xen 4.8. > >>Are you aware of any ARM64 platform we currently support requiring allocation > >>of memory below 4GB? > > > >I have no idea about this (:, but I think this is a bug fix. Alought current > >supported platforms works well, users may choose 4.8 to support their > >new platform which has the limitation to access 64bit address. > > We are already late in the release process (rc5) for Xen 4.8. So we need to > be careful when including a bug fix and evaluate the pros and cons. > > This patch is modifying the DOM0 memory layout for all 64-bit platforms. So > it could potentially break one of the platform we officially support (see > [1] for a non-exhaustive list). We don't have a test suite running > automatically for ARM64 at the moment (it is been working on), this means > that manual testing needs to be done. I am not aware of any platform, in the > list we supports, having this issue so I prefer to stay on the safe side and > defer this patch for Xen 4.9. > > If a user cares about Xen 4.8 for their platforms, then they could request > the patch to be backported in Xen 4.8 after the release and after extensive > testing in staging. > I agree with your reasoning. Wei.
Sorry for the late intrusion into this discussion. I would like to introduce my view of the issues behind 32-bit-addressing DMA controllers in ARMv7/v8 SoCs.

> On AArch64 SoCs, some IPs may only have the capability to access
> 32 bits address space. The physical memory assigned for Dom0 maybe
> not in 4GB address space, then the IPs will not work properly.
> So need to allocate memory under 4GB for Dom0.

IMHO that is the wrong approach; unfortunately the problem is much bigger. Normally you need to run guest domains as well, with at least PV block and PV net drivers. Because the PV drivers are built in such a way that the DMA controller ultimately operates on the DomU's pages, those pages need to come from below 4GB. So any DomU running PV drivers should have some amount of pages from below 4GB. Moreover, the OS running in the DomU should know that only those pages are DMA-able, and the PV drivers should work with DMA-able pages only: i.e. pages should be mapped correspondingly into different banks under and over 4GB.

The approach I believe is more suitable is to specify explicitly the amount of RAM below 4GB and above 4GB for any domain: for dom0 through the Xen command line, for domU through the domain configuration file. Such an approach was implemented by GL. You can find preliminary patches here:
https://lists.xen.org/archives/html/xen-devel/2016-05/msg01785.html
https://lists.xen.org/archives/html/xen-devel/2016-05/msg01786.html
I really hope GL will decide to tailor and upstream the feature.

> I am undecided whether this should be considered as a bug fix for Xen 4.8.
> Are you aware of any ARM64 platform we currently support requiring
> allocation of memory below 4GB?

That is not only an ARM64 problem. Any ARMv7/v8-based platform with 32-bit DMA controllers, RAM above 4GB and no IOMMU supported (or owned) by Xen will suffer from this problem. Among shipping products: the new J6 EVM with 4GB RAM, and Salvator-X.

Sincerely,
Andrii Anisov.
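[To make the proposed interface concrete, it would look roughly like the sketch below. The syntax is purely hypothetical, invented for illustration only: it is not the syntax of the GL patches linked above, nor an accepted Xen or xl option. Only the dom0_lowmem name echoes the bootarg that V1 of this series introduced and later dropped.]

# Hypothetical Xen command line for dom0: total memory plus an explicit low split
dom0_mem=2048M dom0_lowmem=512M

# Hypothetical xl domain configuration for a domU running PV drivers
memory        = 1024    # total MB
lowmem_memory = 256     # MB that must come from below 4GB (illustrative option name)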
Hello, On 11/11/16 11:35, Andrii Anisov wrote: > Sorry for the late intrusion into this discussion. I would introduce my > vision of the issues behind the 32 bits addressing DMA controllers in > ARMv7/v8 SoCs. > > On AArch64 SoCs, some IPs may only have the capability to access > 32 bits address space. The physical memory assigned for Dom0 maybe > not in 4GB address space, then the IPs will not work properly. > So need to allocate memory under 4GB for Dom0. > > IMHO that is a wrong approach. Unfortunately the problem is much bigger. > Normally you would need to run guest domains as well. With at least PV > Block and PV NET drivers. Due to the fact that PV drivers made in a way > that DMA controller at last will work with DomU's pages, those pages > should be from below 4GB. > So any DomU running PV drivers should have some amount of pages from > below 4GB. Moreover, the OS running in DomU should be knowing that only > those pages are DMA-able, and that PV drivers should be working with > DMA-able pages only: I.e. pages should be mapped correspondingly into > different banks under and over 4GB. From my understanding of what you say, the problem is not because domU is using memory above 4GB but the fact that the backend driver does not take the right decision (e.g using bounce buffer when required). The guest should be IPA agnostic and not care how the physical device is working when using PV drivers. So for me, this should be fixed in the DOM0 OS. Regards,
Hello Julien, Please see my comments below: > From my understanding of what you say, the problem is not because domU is using memory above 4GB but the fact that >the backend driver does not take the right decision Yep, the problem could be treated in such a way. > (e.g using bounce buffer when required). I suppose unacceptable performance drop for such kind of solution. An alternative here could be reverting of the FE-BE interaction scheme in a following way: BE side domain provides buffers and maps them to the FE side domain. Some time ago we estimated this approach as huge architecture change and enormous implementation efforts. Also it does answer to the next question: > The guest should be IPA agnostic and not care how the physical device is working when using PV drivers. So for me, > this should be fixed in the DOM0 OS. Do you consider driver domain guests? Sincerely, Andrii Anisov.
Sorry for a confusion. The sentence: > Also it does answer to the next question: should be typed as: > Also it does NOT answer to the next question: > > The guest should be IPA agnostic and not care how the physical device is working when using PV drivers. So for me, > > this should be fixed in the DOM0 OS. > Do you consider driver domain guests? Sincerely, Andrii Anisov. On Fri, Nov 11, 2016 at 4:24 PM, Andrii Anisov <andrii.anisov@gmail.com> wrote: > > Hello Julien, > > Please see my comments below: > > > From my understanding of what you say, the problem is not because domU is using memory above 4GB but the fact that >the backend driver does not take the right decision > > Yep, the problem could be treated in such a way. > > > (e.g using bounce buffer when required). > I suppose unacceptable performance drop for such kind of solution. > > An alternative here could be reverting of the FE-BE interaction scheme > in a following way: BE side domain provides buffers and maps them to > the FE side domain. Some time ago we estimated this approach as huge > architecture change and enormous implementation efforts. Also it does > answer to the next question: > > > The guest should be IPA agnostic and not care how the physical device is working when using PV drivers. So for me, > > this should be fixed in the DOM0 OS. > Do you consider driver domain guests? > > Sincerely, > Andrii Anisov.
On 11/11/16 14:24, Andrii Anisov wrote: > Hello Julien, > > Please see my comments below: > >> From my understanding of what you say, the problem is not because domU is using memory above 4GB but the fact that >the backend driver does not take the right decision > > Yep, the problem could be treated in such a way. > >> (e.g using bounce buffer when required). > I suppose unacceptable performance drop for such kind of solution. Could you define unacceptable performance drop? Have you tried to measure what would be the impact? > An alternative here could be reverting of the FE-BE interaction scheme > in a following way: BE side domain provides buffers and maps them to > the FE side domain. Some time ago we estimated this approach as huge > architecture change and enormous implementation efforts. You could also exhaust the memory of the backend domain. > Also it does > answer to the next question: > >> The guest should be IPA agnostic and not care how the physical device is working when using PV drivers. So for me, >> this should be fixed in the DOM0 OS. > Do you consider driver domain guests? The main point of driver domain is isolating a device/driver in a specific guest. For that you need an SMMU to secure the device, which would also solve the issue with 32-bit DMA-capable device. So why would you want to do driver domain without an SMMU present? Regards,
On Fri, 11 Nov 2016, Andrii Anisov wrote: > Hello Julien, > > Please see my comments below: > > > From my understanding of what you say, the problem is not because domU is using memory above 4GB but the fact that >the backend driver does not take the right decision > > Yep, the problem could be treated in such a way. That is the solution that was adopted on x86 to solve the same problem, see drivers/xen/swiotlb-xen.c in Linux. > > (e.g using bounce buffer when required). > I suppose unacceptable performance drop for such kind of solution. I know it can be bad, depending on the class of protocols. I think that if numbers were provided to demonstrate that bounce buffers (the swiotlb in Linux) are too slow for a given use case, we could consider the approach you suggested. However given that it increases complexity I would rather avoid it unless the performance benefits are major. > An alternative here could be reverting of the FE-BE interaction scheme > in a following way: BE side domain provides buffers and maps them to > the FE side domain. Some time ago we estimated this approach as huge > architecture change and enormous implementation efforts. Also it does > answer to the next question: The problem with this is not much the code changes but the risk of exhausting Dom0 memory. I think the approach you proposed previously, explicitly giving memory below 4G to DomUs, is better. > > The guest should be IPA agnostic and not care how the physical device is working when using PV drivers. So for me, > > this should be fixed in the DOM0 OS. > Do you consider driver domain guests? Yes, they are guests, but Dom0 is a guest too. Maybe a better question is: are driver domains unprivileged guests? Yes, they should be only privilege enough to have control over the device assigned to them. However without an SMMU there is no way to enforce security, because driver domains could use the device to DMA anything they want into Dom0 or Xen memory. In practice without an SMMU driver domains are just like Dom0.
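[For readers unfamiliar with the bounce-buffer idea referenced above (the swiotlb / swiotlb-xen approach): when a device can only address 32 bits, the driver copies data between the caller's high buffer and a pre-allocated low buffer around each DMA operation. The sketch below is only a generic, compilable illustration of that pattern; the extern helpers such as dma_low_pool_alloc() and device_dma_to_device() are hypothetical placeholders standing in for platform/driver services, not the swiotlb-xen API.]

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Generic bounce-buffer pattern for a device limited to 32-bit DMA.
 * All identifiers below are hypothetical placeholders for illustration;
 * this is not the swiotlb-xen implementation.
 */
#define DMA_32BIT_LIMIT   (1ULL << 32)

extern void    *dma_low_pool_alloc(size_t len, uint64_t *dma_addr); /* memory below 4GB */
extern void     dma_low_pool_free(void *buf, size_t len);
extern uint64_t phys_addr_of(const void *buf);
extern int      device_dma_to_device(uint64_t dma_addr, size_t len); /* start DMA, wait */

int dma_write(const void *data, size_t len)
{
    uint64_t phys = phys_addr_of(data);
    uint64_t low_phys;
    void *bounce;
    int rc;

    /* Fast path: the buffer already lies below 4GB, DMA it directly. */
    if ( phys + len <= DMA_32BIT_LIMIT )
        return device_dma_to_device(phys, len);

    /* Slow path: bounce through a low buffer the device can reach. */
    bounce = dma_low_pool_alloc(len, &low_phys);
    if ( bounce == NULL )
        return -1;

    memcpy(bounce, data, len);               /* the extra copy being debated */
    rc = device_dma_to_device(low_phys, len);
    dma_low_pool_free(bounce, len);
    return rc;
}

[For the read direction the copy goes the other way: DMA into the bounce buffer, then memcpy up to the caller. That per-request copy is exactly the overhead discussed below.]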
On Fri, 11 Nov 2016, Julien Grall wrote: > > > The guest should be IPA agnostic and not care how the physical device is > > > working when using PV drivers. So for me, > > > this should be fixed in the DOM0 OS. > > Do you consider driver domain guests? > > The main point of driver domain is isolating a device/driver in a specific > guest. For that you need an SMMU to secure the device, which would also solve > the issue with 32-bit DMA-capable device. > > So why would you want to do driver domain without an SMMU present? There are many reasons: for example because you want Dom0 to be Linux and the storage driver domain to be FreeBSD. Or because you want the network driver domain to be QNX. Without an SMMU, driver domains are not about security anymore, they are about disaggregation and componentization, but they are still a valid choice.
> Without an SMMU, driver domains are not about security anymore, they are > about disaggregation and componentization That is our case. And the thing we can provide to customers on chips without SMMU. Sincerely, Andrii Anisov.
> There are many reasons: for example because you want Dom0 to be Linux
> and the storage driver domain to be FreeBSD. Or because you want the
> network driver domain to be QNX.

What we are evaluating now is a thin Dom0, without any drivers, running from a ramdisk. All drivers would be moved to a special guest domain.

Sincerely,
Andrii Anisov.
> You could also exhaust the memory of the backend domain. > The problem with this is not much the code changes but the risk of > exhausting Dom0 memory. I think the approach you proposed previously, > explicitly giving memory below 4G to DomUs, is better. I see the point. Sincerely, Andrii Anisov.
> Could you define unacceptable performance drop? Have you tried to measure
> what would be the impact?

> I know it can be bad, depending on the class of protocols. I think that
> if numbers were provided to demonstrate that bounce buffers (the swiotlb
> in Linux) are too slow for a given use case

Unfortunately I cannot come up with exact requirement numbers. Introducing another memcpy (which is what the bounce-buffer approach does) for block or network IO would not only reduce the performance of the operation but also increase the overall system load. Everything we do in our PV driver solutions is aimed at avoiding data copying inside the FE-BE pair, in order to increase performance and reduce latency and system load.

Sincerely,
Andrii Anisov.
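[One cheap way to start putting numbers on the cost of that extra copy, before measuring a full PV datapath, is a standalone memcpy bandwidth micro-benchmark. The sketch below only illustrates that idea; the 64KB buffer size and iteration count are arbitrary assumptions, and it is not a benchmark used anywhere in this thread.]

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Rough memcpy bandwidth for a buffer comparable to a block/net request. */
int main(void)
{
    const size_t len = 64 * 1024;      /* 64KB, e.g. a largish I/O request */
    const int iters = 100000;
    char *src = malloc(len), *dst = malloc(len);
    struct timespec t0, t1;

    if ( !src || !dst )
        return 1;
    memset(src, 0xa5, len);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for ( int i = 0; i < iters; i++ )
        memcpy(dst, src, len);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    double gbytes = (double)len * iters / 1e9;
    printf("copied %.2f GB in %.3f s -> %.2f GB/s (check byte: %d)\n",
           gbytes, secs, gbytes / secs, dst[len - 1]);

    free(src);
    free(dst);
    return 0;
}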
Hi Andrii,

On 14/11/2016 03:11, Andrii Anisov wrote:
>> There are many reasons: for example because you want Dom0 to be Linux
>> and the storage driver domain to be FreeBSD. Or because you want the
>> network driver domain to be QNX.
> What we are evaluating now is a thin Dom0, without any drivers, running
> from a ramdisk. All drivers would be moved to a special guest domain.

You may want to have a look at what has been done on x86 with the "Dedicated hardware domain".

Another solution is, rather than moving the devices into a separate domain, to move the toolstack. The latter may cause less trouble on platforms without an SMMU.

Regards,
On Mon, 14 Nov 2016, Andrii Anisov wrote:
> > Could you define unacceptable performance drop? Have you tried to measure
> > what would be the impact?
>
> > I know it can be bad, depending on the class of protocols. I think that
> > if numbers were provided to demonstrate that bounce buffers (the swiotlb
> > in Linux) are too slow for a given use case
>
> Unfortunately I cannot come up with exact requirement numbers.
> Introducing another memcpy (which is what the bounce-buffer approach does)
> for block or network IO would not only reduce the performance of the
> operation but also increase the overall system load. Everything we do in
> our PV driver solutions is aimed at avoiding data copying inside the FE-BE
> pair, in order to increase performance and reduce latency and system load.

I think it might be worth running those numbers: you might be surprised by how well a simple data copy protocol can perform, even on ARM.

For example, take a look at PVCalls, which is entirely based on data copies:

http://marc.info/?l=xen-devel&m=147639616310487

I have already shown that it performs better than netfront/netback on x86 in this blog post:

https://blog.xenproject.org/2016/08/30/pv-calls-a-new-paravirtualized-protocol-for-posix-syscalls/

I have just run the numbers on ARM64 (APM m400) and it is still much faster than netfront/netback. This is what I get by running iperf -c in a VM and iperf -s in Dom0:

          PVCalls        Netfront/Netback
  -P 1    9.9 gbit/s     4.53 gbit/s
  -P 2    17.4 gbit/s    5.57 gbit/s
  -P 4    24.36 gbit/s   5.34 gbit/s

PVCalls is still significantly faster than Netfront/Netback.
Julien,

>> What we are evaluating now is a thin Dom0, without any drivers, running
>> from a ramdisk. All drivers would be moved to a special guest domain.
>
> You may want to have a look at what has been done on x86 with the
> "Dedicated hardware domain".

I will have to look at that.

> Another solution is, rather than moving the devices into a separate
> domain, to move the toolstack.

I see the point. But there are a number of different reasons to have a thin initial domain, e.g. boot time optimization, which is critical for the applications we focus on. In such cases the thin initial domain would first start a special domain, e.g. one responsible for CAN communication (with the minimal needed device set), before one with a rich device set, and a domain actually running the IVI (with PV drivers only) would be started last.

> The latter may cause less trouble on platforms without an SMMU.

I hope we do switch to an IOMMU-capable platform, but we still have to deal with IOMMU-less systems.

Sincerely,
Andrii Anisov.

On Mon, Nov 14, 2016 at 10:30 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Andrii,
>
> On 14/11/2016 03:11, Andrii Anisov wrote:
>>> There are many reasons: for example because you want Dom0 to be Linux
>>> and the storage driver domain to be FreeBSD. Or because you want the
>>> network driver domain to be QNX.
>>
>> What we are evaluating now is a thin Dom0, without any drivers, running
>> from a ramdisk. All drivers would be moved to a special guest domain.
>
> You may want to have a look at what has been done on x86 with the
> "Dedicated hardware domain".
>
> Another solution is, rather than moving the devices into a separate
> domain, to move the toolstack. The latter may cause less trouble on
> platforms without an SMMU.
>
> Regards,
>
> --
> Julien Grall
> For example, take a look at PVCalls, which is entirely based on data
> copies:
>
> http://marc.info/?l=xen-devel&m=147639616310487
>
> I have already shown that it performs better than netfront/netback on
> x86 in this blog post:
>
> https://blog.xenproject.org/2016/08/30/pv-calls-a-new-paravirtualized-protocol-for-posix-syscalls/
>
> I have just run the numbers on ARM64 (APM m400) and it is still much
> faster than netfront/netback. This is what I get by running iperf -c in
> a VM and iperf -s in Dom0:
>
>           PVCalls        Netfront/Netback
>   -P 1    9.9 gbit/s     4.53 gbit/s
>   -P 2    17.4 gbit/s    5.57 gbit/s
>   -P 4    24.36 gbit/s   5.34 gbit/s
>
> PVCalls is still significantly faster than Netfront/Netback.

This does not seem to be a really fair comparison, and it does not reflect the performance impact of the data copying itself.

Among other things, our team is working on a PV DRM implementation now. I expect the first implementation will use data copying, and then we will introduce zero-copy, so it should be a good example for collecting and sharing impact numbers.

In the embedded application area we are currently focused on, the acceptable performance drop, e.g. for IO operations, is estimated at 3-5% compared to a bare-metal system.

Anyway, thank you for your comments, suggestions and examples. I've got the point that we need solid reasoning, backed by a pack of numbers, to get something specific to us accepted by the community.

Sincerely,
Andrii Anisov.
On Wed, 16 Nov 2016, Andrii Anisov wrote: > > For example, take a look at PVCalls which is entirely based on data > > copies: > > > > http://marc.info/?l=xen-devel&m=147639616310487 > > > > > > I have already shown that it performs better than netfront/netback on > > x86 in this blog post: > > > > https://blog.xenproject.org/2016/08/30/pv-calls-a-new-paravirtualized-protocol-for-posix-syscalls/ > > > > > > I have just run the numbers on ARM64 (APM m400) and it is still much > > faster than netfront/netback. This is what I get by running iperf -c in > > a VM and iperf -s in Dom0: > > > > PVCalls Netfront/Netback > > -P 1 9.9 gbit/s 4.53 gbit/s > > -P 2 17.4 gbit/s 5.57 gbit/s > > -P 4 24.36 gbit/s 5.34 gbit/s > > > > PVCalls is still significantly faster than Netfront/Netback. > This seems to be not a really fair comparison. And does not reflect > performance impact of the data copying itself. Why it is not a fair comparison? Because the design is different or because of the settings? I am happy to adjust benchmarking settings to make the comparison fairer. > Among all, our team is working on PV DRM implementation now. I guess > the first implementation would have a data copying, then we will > introduce a zero-copy. So it should be a good example to collect and > share impact numbers. Looking forward to them. > In embedded applications area, we are currently focused on, acceptable > performance drop, f.e. for io operations, is estimated as 3-5% > comparing to bare-metal system. That's very interesting info and certainly a difficult mark to reach. > Anyway thank you for your comments, suggestions and examples. > I've got the point that we have to have solid reasoning baked with > pack of numbers to get something specific to us accepted by community. Cheers, Stefano
On Thu, 17 Nov 2016, Stefano Stabellini wrote:
> > > I have just run the numbers on ARM64 (APM m400) and it is still much
> > > faster than netfront/netback. This is what I get by running iperf -c in
> > > a VM and iperf -s in Dom0:
> > >
> > >           PVCalls        Netfront/Netback
> > >   -P 1    9.9 gbit/s     4.53 gbit/s
> > >   -P 2    17.4 gbit/s    5.57 gbit/s
> > >   -P 4    24.36 gbit/s   5.34 gbit/s
> > >
> > > PVCalls is still significantly faster than Netfront/Netback.
> >
> > This does not seem to be a really fair comparison, and it does not
> > reflect the performance impact of the data copying itself.
>
> Why it is not a fair comparison? Because the design is different or
> because of the settings? I am happy to adjust benchmarking settings to
> make the comparison fairer.

Actually it turns out that Netfront/Netback use another form of copy: grant copies. So you are right that this comparison doesn't really reflect copy vs. mapping performance.
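[For context on "grant copies": instead of mapping the frontend's granted page, the backend asks the hypervisor to copy data to/from it. The fragment below sketches what a single grant-copy request looks like from a Linux backend, based on the public grant-table ABI; the function name demo_grant_copy and its use of a raw local frame for the destination are simplifications for illustration, not code taken from netback.]

#include <linux/errno.h>
#include <xen/interface/xen.h>
#include <xen/interface/grant_table.h>
#include <asm/xen/hypercall.h>

/*
 * Illustrative only: copy 'len' bytes from a page granted by domain
 * 'otherend' (grant reference 'gref') into a local backend frame,
 * letting the hypervisor perform the copy.
 */
static int demo_grant_copy(domid_t otherend, grant_ref_t gref,
                           xen_pfn_t local_frame, uint16_t len)
{
    struct gnttab_copy op = {
        .source.u.ref  = gref,
        .source.domid  = otherend,
        .source.offset = 0,
        .dest.u.gmfn   = local_frame,
        .dest.domid    = DOMID_SELF,
        .dest.offset   = 0,
        .len           = len,
        .flags         = GNTCOPY_source_gref, /* source is a grant ref, dest a local frame */
    };

    if ( HYPERVISOR_grant_table_op(GNTTABOP_copy, &op, 1) )
        return -EFAULT;

    return op.status == GNTST_okay ? 0 : -EIO;
}

[In practice the backend batches many such operations per event (Linux has helpers like gnttab_batch_copy() for this) rather than issuing them one at a time, which is part of why the copy-based datapath holds up well in the numbers above.]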
> Why it is not a fair comparison? Because the design is different or
> because of the settings?

Because of the design difference: it is not memcpy vs. mapping within the same stack (design). Also, you measured interdomain communication only, without involving hardware interfaces.

> I am happy to adjust benchmarking settings to make the comparison fairer.

BTW, what about communication between a DomU and the external network through the hw interface?

Sincerely,
Andrii Anisov.
On Mon, 21 Nov 2016, Andrii Anisov wrote: > > Why it is not a fair comparison? Because the design is different or > > because of the settings? > Because the design difference. > It's not about memcpy vs mapping within the same stack (design). And > you measured interdomain communication only, not involving hardware > interfaces. > > > I am happy to adjust benchmarking settings to make the comparison fairer. > BTW, what about communication between DomU and the external network > through the hw interface? Unfortunately I don't have the HW to do that test at the moment, but I would be interested in seeing the numbers.
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 35ab08d..6b5ac8d 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -195,9 +195,9 @@ fail:
  * bank. Partly this is just easier for us to deal with, but also
  * the ramdisk and DTB must be placed within a certain proximity of
  * the kernel within RAM.
- * 3. For 32-bit dom0 we want to place as much of the RAM as we
- *    reasonably can below 4GB, so that it can be used by non-LPAE
- *    enabled kernels.
+ * 3. For dom0 we want to place as much of the RAM as we reasonably can
+ *    below 4GB, so that it can be used by non-LPAE enabled kernels (32-bit)
+ *    or when a device assigned to dom0 can only do 32-bit DMA access.
  * 4. For 32-bit dom0 the kernel must be located below 4GB.
  * 5. We want to have a few largers banks rather than many smaller ones.
  *
@@ -230,7 +230,8 @@ fail:
  * we give up.
  *
  * For 32-bit domain we require that the initial allocation for the
- * first bank is under 4G. Then for the subsequent allocations we
+ * first bank is under 4G. For 64-bit domain, the first bank is preferred
+ * to be allocated under 4G. Then for the subsequent allocations we
  * initially allocate memory only from below 4GB. Once that runs out
  * (as described above) we allow higher allocations and continue until
  * that runs out (or we have allocated sufficient dom0 memory).
@@ -244,7 +245,7 @@ static void allocate_memory(struct domain *d, struct kernel_info *kinfo)
     unsigned int order = get_11_allocation_size(kinfo->unassigned_mem);
     int i;
 
-    bool_t lowmem = is_32bit_domain(d);
+    bool_t lowmem = true;
     unsigned int bits;
 
     /*
@@ -269,20 +270,30 @@ static void allocate_memory(struct domain *d, struct kernel_info *kinfo)
         {
             pg = alloc_domheap_pages(d, order, MEMF_bits(bits));
             if ( pg != NULL )
+            {
+                if ( !insert_11_bank(d, kinfo, pg, order) )
+                    BUG(); /* Cannot fail for first bank */
+
                 goto got_bank0;
+            }
         }
         order--;
     }
 
-    panic("Unable to allocate first memory bank");
-
- got_bank0:
+    /* Failed to allocate bank0 under 4GB */
+    if ( is_32bit_domain(d) )
+        panic("Unable to allocate first memory bank.");
 
-    if ( !insert_11_bank(d, kinfo, pg, order) )
-        BUG(); /* Cannot fail for first bank */
+    /* Try to allocate memory from above 4GB */
+    printk(XENLOG_INFO "No bank has been allocated below 4GB.\n");
+    lowmem = false;
 
-    /* Now allocate more memory and fill in additional banks */
+ got_bank0:
 
+    /*
+     * If we failed to allocate bank0 under 4GB, continue allocating
+     * memory from above 4GB and fill in banks.
+     */
     order = get_11_allocation_size(kinfo->unassigned_mem);
     while ( kinfo->unassigned_mem && kinfo->mem.nr_banks < NR_MEM_BANKS )
     {
On AArch64 SoCs, some IPs may only have the capability to access a 32-bit address space. The physical memory assigned to Dom0 may not be within the 4GB address space, in which case those IPs will not work properly. So memory under 4GB needs to be allocated for Dom0.

There is no restriction on how much lowmem needs to be allocated for Dom0, so allocate as much lowmem as possible for Dom0.

This patch does not affect 32-bit domains, because the variable "lowmem" is set to true at the beginning. If allocating bank0 under 4GB fails, we need to panic for a 32-bit domain, because a 32-bit domain requires bank0 to be allocated under 4GB.

For a 64-bit domain, set "lowmem" to false and continue allocating memory from above 4GB.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
---
This patch is to resolve the issue mentioned in
https://lists.xen.org/archives/html/xen-devel/2016-09/msg00235.html

Tested results:
(XEN) Allocating 1:1 mappings totalling 2048MB for dom0:
(XEN) BANK[0] 0x00000088000000-0x000000f8000000 (1792MB)
(XEN) BANK[1] 0x000009e0000000-0x000009f0000000 (256MB)
1792MB allocated in the 4GB address space.

V4:
 Address comments on V3: https://lists.xen.org/archives/html/xen-devel/2016-09/msg02499.html
 Drop the unnecessary check when allocation under 4GB fails
 Refine comments according to Julien's suggestion on V3
 Keep "bits <= (lowmem ? 32 : PADDR_BITS)" rather than changing it to "bits <= 32"

V3:
 Add more commit log
 Add more comments
 Add back the panic if bank0 cannot be allocated under 4GB for a 32-bit domain

V2:
 Remove the dom0_lowmem bootarg introduced in V1.
 Following "https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01459.html",
 allocate as much lowmem as possible.

 xen/arch/arm/domain_build.c | 33 ++++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 11 deletions(-)