Message ID | 1585844328-30654-1-git-send-email-igor.druzhinin@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Series | hvmloader: probe memory below 4G before allocation for OVMF |
On 02.04.2020 18:18, Igor Druzhinin wrote:
> The area just below 4G where the OVMF image is originally relocated is
> not necessarily a hole - it might contain pages preallocated by the
> device model or the toolstack. By unconditionally populating on top of
> this memory, the original pages get lost while still potentially being
> foreign mapped in Dom0.

When there are pre-allocated pages - have they been orphaned? If
so, shouldn't whoever populated them unpopulate rather than
orphaning them? Or if not - how is the re-use you do safe?

> --- a/tools/firmware/hvmloader/util.c
> +++ b/tools/firmware/hvmloader/util.c
> @@ -398,6 +398,20 @@ int get_mem_mapping_layout(struct e820entry entries[], uint32_t *max_entries)
>      return rc;
>  }
>
> +bool mem_probe_ram(xen_pfn_t mfn)
> +{
> +    uint32_t tmp, magic = 0xdeadbeef;
> +    volatile uint32_t *addr = (volatile uint32_t *)(mfn << PAGE_SHIFT);
> +
> +    tmp = *addr;
> +    *addr = magic;
> +    if ( *addr != magic )
> +        return 0;
> +
> +    *addr = tmp;
> +    return 1;
> +}

This looks to probe r/o behavior. If there was a ROM page pre-populated,
wouldn't it then be equally lost once you populate RAM over it? And what
if this is MMIO, i.e. writable but potentially with side effects?

Whether, as you suggest as an alternative, moving populating of this
space to the tool stack is feasible I don't know. If it was, I would
wonder though why it wasn't done like this in the first place.

Jan
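For illustration only - a sketch of how the three outcomes Jan distinguishes (RAM, hole, ROM-like) might be told apart. It rests on an assumption: that an unpopulated GFN reads back as all-ones when the access completes as unclaimed MMIO. `mem_probe_kind` and the `mem_kind` enum are hypothetical names, not part of the patch, and writable MMIO with side effects would still pass the write test exactly like RAM:

```c
#include <stdint.h>

/* These come from hvmloader's util.h in the real code; repeated here
 * (hypothetically) so the sketch is self-contained. */
typedef unsigned long xen_pfn_t;
#define PAGE_SHIFT 12

enum mem_kind { MEM_KIND_HOLE, MEM_KIND_RAM, MEM_KIND_ROM_LIKE };

static enum mem_kind mem_probe_kind(xen_pfn_t mfn)
{
    uint32_t tmp, magic = 0xdeadbeef;
    volatile uint32_t *addr = (volatile uint32_t *)(mfn << PAGE_SHIFT);

    tmp = *addr;
    *addr = magic;
    if ( *addr == magic )
    {
        *addr = tmp;    /* write stuck: RAM (or side-effect-free MMIO) */
        return MEM_KIND_RAM;
    }

    /*
     * Write was lost: either nothing is mapped (assumed to read back as
     * all-ones, per the unclaimed-MMIO convention) or something
     * read-only sits here.  A ROM page that happens to contain
     * 0xffffffff would still be misclassified as a hole.
     */
    return tmp == ~0u ? MEM_KIND_HOLE : MEM_KIND_ROM_LIKE;
}
```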
On 03/04/2020 14:53, Jan Beulich wrote:
> On 02.04.2020 18:18, Igor Druzhinin wrote:
>> The area just below 4G where the OVMF image is originally relocated is
>> not necessarily a hole - it might contain pages preallocated by the
>> device model or the toolstack. By unconditionally populating on top of
>> this memory, the original pages get lost while still potentially being
>> foreign mapped in Dom0.
>
> When there are pre-allocated pages - have they been orphaned? If
> so, shouldn't whoever populated them unpopulate rather than
> orphaning them? Or if not - how is the re-use you do safe?

There is no signal to the device model when that happens - it has no idea
when the memory it populated and holds a mapping to suddenly becomes
unmapped in the guest. Re-use is safe because the memory prepopulated
before OVMF starts is not put to use until OVMF has finished initializing.

>> --- a/tools/firmware/hvmloader/util.c
>> +++ b/tools/firmware/hvmloader/util.c
>> @@ -398,6 +398,20 @@ int get_mem_mapping_layout(struct e820entry entries[], uint32_t *max_entries)
>>      return rc;
>>  }
>>
>> +bool mem_probe_ram(xen_pfn_t mfn)
>> +{
>> +    uint32_t tmp, magic = 0xdeadbeef;
>> +    volatile uint32_t *addr = (volatile uint32_t *)(mfn << PAGE_SHIFT);
>> +
>> +    tmp = *addr;
>> +    *addr = magic;
>> +    if ( *addr != magic )
>> +        return 0;
>> +
>> +    *addr = tmp;
>> +    return 1;
>> +}
>
> This looks to probe r/o behavior. If there was a ROM page pre-populated,
> wouldn't it then be equally lost once you populate RAM over it? And what
> if this is MMIO, i.e. writable but potentially with side effects?

I assume that if the pages behind it are not r/w there is no way to avoid
crashing immediately other than to go and repopulate on top. MMIO is a
problematic corner case that I'd expect the device model to avoid.

> Whether, as you suggest as an alternative, moving populating of this
> space to the tool stack is feasible I don't know. If it was, I would
> wonder though why it wasn't done like this in the first place.

I expect one complication is knowing the firmware type at the moment the
domain is constructed, to make sure the area is populated for OVMF only.

Igor
On 03.04.2020 16:26, Igor Druzhinin wrote:
> On 03/04/2020 14:53, Jan Beulich wrote:
>> On 02.04.2020 18:18, Igor Druzhinin wrote:
>>> The area just below 4G where the OVMF image is originally relocated is
>>> not necessarily a hole - it might contain pages preallocated by the
>>> device model or the toolstack. By unconditionally populating on top of
>>> this memory, the original pages get lost while still potentially being
>>> foreign mapped in Dom0.
>>
>> When there are pre-allocated pages - have they been orphaned? If
>> so, shouldn't whoever populated them unpopulate rather than
>> orphaning them? Or if not - how is the re-use you do safe?
>
> There is no signal to the device model when that happens - it has no idea
> when the memory it populated and holds a mapping to suddenly becomes
> unmapped in the guest. Re-use is safe because the memory prepopulated
> before OVMF starts is not put to use until OVMF has finished initializing.

I guess I'm lacking some details here to fully understand what you're
saying: What is it that may pre-populate this range, and for what
purpose?

Jan
On 03/04/2020 14:53, Jan Beulich wrote:
> On 02.04.2020 18:18, Igor Druzhinin wrote:
>> The area just below 4G where the OVMF image is originally relocated is
>> not necessarily a hole - it might contain pages preallocated by the
>> device model or the toolstack. By unconditionally populating on top of
>> this memory, the original pages get lost while still potentially being
>> foreign mapped in Dom0.
>
> When there are pre-allocated pages - have they been orphaned? If
> so, shouldn't whoever populated them unpopulate rather than
> orphaning them? Or if not - how is the re-use you do safe?

So this is a mess.

OVMF is linked to run at a fixed address suitable for native hardware,
which is in the SPI ROM immediately below the 4G boundary (this is
correct). We also put the framebuffer there (this is not correct).

This was fine for RomBIOS, which is located under the 1M boundary.

It is also fine for a fully-emulated VGA device in Qemu, because the
framebuffer is moved (re-set up, actually) when the virtual BAR is moved,
but with a real GPU (SR-IOV in this case), there is no logic to play such
games.

The problem is entirely caused by the framebuffer in Xen not being like
any real system. The framebuffer isn't actually in a BAR, and also
doesn't manifest itself in the way that graphics-stolen-ram normally
does, either.

~Andrew
On 03/04/2020 15:39, Andrew Cooper wrote:
> On 03/04/2020 14:53, Jan Beulich wrote:
>> On 02.04.2020 18:18, Igor Druzhinin wrote:
>>> The area just below 4G where the OVMF image is originally relocated is
>>> not necessarily a hole - it might contain pages preallocated by the
>>> device model or the toolstack. By unconditionally populating on top of
>>> this memory, the original pages get lost while still potentially being
>>> foreign mapped in Dom0.
>>
>> When there are pre-allocated pages - have they been orphaned? If
>> so, shouldn't whoever populated them unpopulate rather than
>> orphaning them? Or if not - how is the re-use you do safe?
>
> So this is a mess.
>
> OVMF is linked to run at a fixed address suitable for native hardware,
> which is in the SPI ROM immediately below the 4G boundary (this is
> correct). We also put the framebuffer there (this is not correct).
>
> This was fine for RomBIOS, which is located under the 1M boundary.
>
> It is also fine for a fully-emulated VGA device in Qemu, because the
> framebuffer is moved (re-set up, actually) when the virtual BAR is moved,
> but with a real GPU (SR-IOV in this case), there is no logic to play such
> games.
>
> The problem is entirely caused by the framebuffer in Xen not being like
> any real system. The framebuffer isn't actually in a BAR, and also
> doesn't manifest itself in the way that graphics-stolen-ram normally
> does, either.

Adding to what Andrew said:

There are multiple technical complications that caused this mess. One of
them is that there is unfortunately no better place for the framebuffer
to be located initially. Second, the SR-IOV device is real, and adding a
virtual BAR to it is complicated (for compatibility reasons), so NVIDIA
decided to avoid that.

Igor
On 03.04.2020 16:47, Igor Druzhinin wrote:
> On 03/04/2020 15:39, Andrew Cooper wrote:
>> On 03/04/2020 14:53, Jan Beulich wrote:
>>> On 02.04.2020 18:18, Igor Druzhinin wrote:
>>>> The area just below 4G where the OVMF image is originally relocated is
>>>> not necessarily a hole - it might contain pages preallocated by the
>>>> device model or the toolstack. By unconditionally populating on top of
>>>> this memory, the original pages get lost while still potentially being
>>>> foreign mapped in Dom0.
>>>
>>> When there are pre-allocated pages - have they been orphaned? If
>>> so, shouldn't whoever populated them unpopulate rather than
>>> orphaning them? Or if not - how is the re-use you do safe?
>>
>> So this is a mess.
>>
>> OVMF is linked to run at a fixed address suitable for native hardware,
>> which is in the SPI ROM immediately below the 4G boundary (this is
>> correct). We also put the framebuffer there (this is not correct).
>>
>> This was fine for RomBIOS, which is located under the 1M boundary.
>>
>> It is also fine for a fully-emulated VGA device in Qemu, because the
>> framebuffer is moved (re-set up, actually) when the virtual BAR is moved,
>> but with a real GPU (SR-IOV in this case), there is no logic to play such
>> games.

So are you saying that, in the OVMF case, OVMF starts out appearing to
run in VRAM until the frame buffer gets moved? If so, with the logic
added by this patch, how would both places (the old VRAM address, where
OVMF lives, and the new VRAM address) end up backed by distinct pages?
Wouldn't the avoided re-populate result in the same page having two uses?
Or alternatively there being a hole in OVMF space, which would be a
problem if this was backing runtime memory?

>> The problem is entirely caused by the framebuffer in Xen not being like
>> any real system. The framebuffer isn't actually in a BAR, and also
>> doesn't manifest itself in the way that graphics-stolen-ram normally
>> does, either.
>
> Adding to what Andrew said:
>
> There are multiple technical complications that caused this mess. One of
> them is that there is unfortunately no better place for the framebuffer
> to be located initially. Second, the SR-IOV device is real, and adding a
> virtual BAR to it is complicated (for compatibility reasons), so NVIDIA
> decided to avoid that.

In which case I wonder - aren't you ending up with the MMIO case that I
had mentioned, and that you said is difficult to deal with?

Jan
On 03/04/2020 16:05, Jan Beulich wrote:
> On 03.04.2020 16:47, Igor Druzhinin wrote:
>> On 03/04/2020 15:39, Andrew Cooper wrote:
>>> On 03/04/2020 14:53, Jan Beulich wrote:
>>>> On 02.04.2020 18:18, Igor Druzhinin wrote:
>>>>> The area just below 4G where the OVMF image is originally relocated is
>>>>> not necessarily a hole - it might contain pages preallocated by the
>>>>> device model or the toolstack. By unconditionally populating on top of
>>>>> this memory, the original pages get lost while still potentially being
>>>>> foreign mapped in Dom0.
>>>>
>>>> When there are pre-allocated pages - have they been orphaned? If
>>>> so, shouldn't whoever populated them unpopulate rather than
>>>> orphaning them? Or if not - how is the re-use you do safe?
>>>
>>> So this is a mess.
>>>
>>> OVMF is linked to run at a fixed address suitable for native hardware,
>>> which is in the SPI ROM immediately below the 4G boundary (this is
>>> correct). We also put the framebuffer there (this is not correct).
>>>
>>> This was fine for RomBIOS, which is located under the 1M boundary.
>>>
>>> It is also fine for a fully-emulated VGA device in Qemu, because the
>>> framebuffer is moved (re-set up, actually) when the virtual BAR is moved,
>>> but with a real GPU (SR-IOV in this case), there is no logic to play such
>>> games.
>
> So are you saying that, in the OVMF case, OVMF starts out appearing to
> run in VRAM until the frame buffer gets moved? If so, with the logic
> added by this patch, how would both places (the old VRAM address, where
> OVMF lives, and the new VRAM address) end up backed by distinct pages?
> Wouldn't the avoided re-populate result in the same page having two uses?
> Or alternatively there being a hole in OVMF space, which would be a
> problem if this was backing runtime memory?

In the normal (non-SR-IOV) case VRAM gets evacuated (by PCI logic) before
hvmloader overwrites it, so the issue is avoided. But for SR-IOV the VRAM
stays, so the VRAM area is temporarily used to hold the OVMF image until
decompression is complete. With this patch the VRAM pages would be used
for that purpose instead of new ones.

>>> The problem is entirely caused by the framebuffer in Xen not being like
>>> any real system. The framebuffer isn't actually in a BAR, and also
>>> doesn't manifest itself in the way that graphics-stolen-ram normally
>>> does, either.
>>
>> Adding to what Andrew said:
>>
>> There are multiple technical complications that caused this mess. One of
>> them is that there is unfortunately no better place for the framebuffer
>> to be located initially. Second, the SR-IOV device is real, and adding a
>> virtual BAR to it is complicated (for compatibility reasons), so NVIDIA
>> decided to avoid that.
>
> In which case I wonder - aren't you ending up with the MMIO case that I
> had mentioned, and that you said is difficult to deal with?

No, it's the VRAM area (normal RAM pages) - not MMIO.

Igor
On 03.04.2020 17:17, Igor Druzhinin wrote:
> On 03/04/2020 16:05, Jan Beulich wrote:
>> On 03.04.2020 16:47, Igor Druzhinin wrote:
>>> There are multiple technical complications that caused this mess. One of
>>> them is that there is unfortunately no better place for the framebuffer
>>> to be located initially. Second, the SR-IOV device is real, and adding a
>>> virtual BAR to it is complicated (for compatibility reasons), so NVIDIA
>>> decided to avoid that.
>>
>> In which case I wonder - aren't you ending up with the MMIO case that I
>> had mentioned, and that you said is difficult to deal with?
>
> No, it's the VRAM area (normal RAM pages) - not MMIO.

Well, VRAM is still MMIO from the CPU's perspective, just without any
side effects. But if it was another device that was passed through,
couldn't its MMIO similarly end up in that area?

Jan
On 03/04/2020 16:28, Jan Beulich wrote:
> On 03.04.2020 17:17, Igor Druzhinin wrote:
>> On 03/04/2020 16:05, Jan Beulich wrote:
>>> On 03.04.2020 16:47, Igor Druzhinin wrote:
>>>> There are multiple technical complications that caused this mess. One of
>>>> them is that there is unfortunately no better place for the framebuffer
>>>> to be located initially. Second, the SR-IOV device is real, and adding a
>>>> virtual BAR to it is complicated (for compatibility reasons), so NVIDIA
>>>> decided to avoid that.
>>>
>>> In which case I wonder - aren't you ending up with the MMIO case that I
>>> had mentioned, and that you said is difficult to deal with?
>>
>> No, it's the VRAM area (normal RAM pages) - not MMIO.
>
> Well, VRAM is still MMIO from the CPU's perspective, just without any
> side effects. But if it was another device that was passed through,
> couldn't its MMIO similarly end up in that area?

As Andrew said, Xen VRAM is just normal RAM. It's originally populated in
this area for lack of a better place (at least for now). It's special and
has nothing to do with devices passed through using conventional PCI
mechanisms, whose BARs will only occupy the MMIO hole.

Igor
> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Igor Druzhinin
> Sent: 03 April 2020 16:36
> To: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>; roger.pau@citrix.com; ian.jackson@eu.citrix.com;
> wl@xen.org; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH] hvmloader: probe memory below 4G before allocation for OVMF
>
> On 03/04/2020 16:28, Jan Beulich wrote:
> > On 03.04.2020 17:17, Igor Druzhinin wrote:
> >> On 03/04/2020 16:05, Jan Beulich wrote:
> >>> On 03.04.2020 16:47, Igor Druzhinin wrote:
> >>>> There are multiple technical complications that caused this mess. One of
> >>>> them is that there is unfortunately no better place for the framebuffer
> >>>> to be located initially. Second, the SR-IOV device is real, and adding a
> >>>> virtual BAR to it is complicated (for compatibility reasons), so NVIDIA
> >>>> decided to avoid that.
> >>>
> >>> In which case I wonder - aren't you ending up with the MMIO case that I
> >>> had mentioned, and that you said is difficult to deal with?
> >>
> >> No, it's the VRAM area (normal RAM pages) - not MMIO.
> >
> > Well, VRAM is still MMIO from the CPU's perspective, just without any
> > side effects. But if it was another device that was passed through,
> > couldn't its MMIO similarly end up in that area?
>
> As Andrew said, Xen VRAM is just normal RAM. It's originally populated in
> this area for lack of a better place (at least for now). It's special and
> has nothing to do with devices passed through using conventional PCI
> mechanisms, whose BARs will only occupy the MMIO hole.
>

I assume Jan's point is that the guest doesn't access it as if it is
normal RAM. It's only accessed directly if it is present in a PCI BAR;
otherwise it is accessed indirectly (via QEMU) in response to accesses
to the VGA aperture.

Paul
On 02/04/2020 17:18, Igor Druzhinin wrote:
> The area just below 4G where the OVMF image is originally relocated is
> not necessarily a hole - it might contain pages preallocated by the
> device model or the toolstack. By unconditionally populating on top of
> this memory, the original pages get lost while still potentially being
> foreign mapped in Dom0.
>
> Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> ---
> That doesn't seem necessary for the upstream toolstack, at least for now.
> An alternative might be to move population of this area to the toolstack,
> where there is more control over the memory layout.
> ---

Thanks for the discussion, please ignore this patch. We found a better
way to deal with our problem locally. We will introduce a reserved region
within the MMIO hole that is managed similarly to stolen memory.

Igor
diff --git a/tools/firmware/hvmloader/ovmf.c b/tools/firmware/hvmloader/ovmf.c
index 23610a0..70d5f70 100644
--- a/tools/firmware/hvmloader/ovmf.c
+++ b/tools/firmware/hvmloader/ovmf.c
@@ -106,7 +106,8 @@ static void ovmf_load(const struct bios_config *config,
     {
         mfn = (uint32_t) (addr >> PAGE_SHIFT);
         addr += PAGE_SIZE;
-        mem_hole_populate_ram(mfn, 1);
+        if ( !mem_probe_ram(mfn) )
+            mem_hole_populate_ram(mfn, 1);
     }
 
     /* Check that source and destination does not overlaps. */
diff --git a/tools/firmware/hvmloader/util.c b/tools/firmware/hvmloader/util.c
index 0c3f2d2..724cea0 100644
--- a/tools/firmware/hvmloader/util.c
+++ b/tools/firmware/hvmloader/util.c
@@ -398,6 +398,20 @@ int get_mem_mapping_layout(struct e820entry entries[], uint32_t *max_entries)
     return rc;
 }
 
+bool mem_probe_ram(xen_pfn_t mfn)
+{
+    uint32_t tmp, magic = 0xdeadbeef;
+    volatile uint32_t *addr = (volatile uint32_t *)(mfn << PAGE_SHIFT);
+
+    tmp = *addr;
+    *addr = magic;
+    if ( *addr != magic )
+        return 0;
+
+    *addr = tmp;
+    return 1;
+}
+
 void mem_hole_populate_ram(xen_pfn_t mfn, uint32_t nr_mfns)
 {
     static int over_allocated;
diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
index 7bca641..00a7c13 100644
--- a/tools/firmware/hvmloader/util.h
+++ b/tools/firmware/hvmloader/util.h
@@ -194,6 +194,9 @@ int vprintf(const char *fmt, va_list ap);
 /* Buffer output */
 int snprintf(char *buf, size_t size, const char *fmt, ...)
     __attribute__ ((format (printf, 3, 4)));
 
+/* Probe whether a page is populated with RAM. */
+bool mem_probe_ram(xen_pfn_t mfn);
+
 /* Populate specified memory hole with RAM. */
 void mem_hole_populate_ram(xen_pfn_t mfn, uint32_t nr_mfns);
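For context, a minimal self-contained sketch of the call pattern the ovmf.c hunk above introduces - probe first, populate only on a miss. `populate_ovmf_window` and the `SKETCH_OVMF_*` bounds are illustrative assumptions (the real bounds live in ovmf.c); `mem_probe_ram()`, `mem_hole_populate_ram()`, `xen_pfn_t` and `PAGE_SIZE`/`PAGE_SHIFT` come from hvmloader's util.h as in the patch:

```c
#include <stdint.h>
#include "util.h"   /* mem_probe_ram(), mem_hole_populate_ram(), PAGE_SIZE */

/* Illustrative window only - assume a 2M image ending at the 4G boundary. */
#define SKETCH_OVMF_END    0x100000000ULL
#define SKETCH_OVMF_SIZE   0x00200000ULL
#define SKETCH_OVMF_BEGIN  (SKETCH_OVMF_END - SKETCH_OVMF_SIZE)

static void populate_ovmf_window(void)
{
    uint64_t addr;

    for ( addr = SKETCH_OVMF_BEGIN; addr < SKETCH_OVMF_END; addr += PAGE_SIZE )
    {
        xen_pfn_t mfn = (uint32_t)(addr >> PAGE_SHIFT);

        /*
         * Only populate pages that fail the probe; pages that already
         * read/write back correctly were pre-populated (e.g. as VRAM)
         * and must be left in place rather than replaced.
         */
        if ( !mem_probe_ram(mfn) )
            mem_hole_populate_ram(mfn, 1);
    }
}
```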
The area just below 4G where the OVMF image is originally relocated is
not necessarily a hole - it might contain pages preallocated by the
device model or the toolstack. By unconditionally populating on top of
this memory, the original pages get lost while still potentially being
foreign mapped in Dom0.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
That doesn't seem necessary for the upstream toolstack, at least for now.
An alternative might be to move population of this area to the toolstack,
where there is more control over the memory layout.
---
 tools/firmware/hvmloader/ovmf.c |  3 ++-
 tools/firmware/hvmloader/util.c | 14 ++++++++++++++
 tools/firmware/hvmloader/util.h |  3 +++
 3 files changed, 19 insertions(+), 1 deletion(-)