Message ID | 1415395125-18926-1-git-send-email-agraf@suse.de (mailing list archive) |
---|---|
State | New, archived |
On 07.11.14 22:18, Alexander Graf wrote:
> Memory slots have to be page aligned to get entered into KVM. There
> is existing logic that tries to ensure that we pad memory slots that
> are not page aligned to the biggest region that would still fit in the
> alignment requirements.
>
> Unfortunately, that logic is broken. It tries to calculate the start
> offset based on the region size.
>
> Fix up the logic to do the thing it was intended to do and document it
> properly in the comment above it.
>
> With this patch applied, I can successfully run an e500 guest with more
> than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alexander Graf <agraf@suse.de>

If everyone agrees that this patch does indeed do what the code is
intended to do (I think it's quite correct, although to be 100% right it
should use getpagesize() rather than TARGET_PAGE_SIZE), this should still
go into 2.2.

Alex

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
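The getpagesize() remark above can be sketched as follows. This is only an illustration, not code from the patch: `host_page_delta` and `host_page_size` are hypothetical helper names, and `sysconf(_SC_PAGESIZE)` is used as the POSIX way of asking for the same value that getpagesize() returns. The point is that the padding depends on the page size of the host running KVM, which need not match the target page size.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not part of the patch): number of bytes to skip so
 * that start_addr lands on the next boundary of a page of page_size bytes.
 * Returns 0 when start_addr is already aligned. */
static uint64_t host_page_delta(uint64_t start_addr, uint64_t page_size)
{
    /* The mask arithmetic below only works for power-of-two page sizes. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t offset_mask = page_size - 1;
    return (page_size - (start_addr & offset_mask)) & offset_mask;
}

/* Hypothetical helper: query the host page size at run time instead of
 * relying on a compile-time constant. */
static uint64_t host_page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    assert(sz > 0);
    return (uint64_t)sz;
}
```

Under these assumptions, `host_page_delta(0x10100, 0x1000)` gives 0xF00 for 4 KiB pages, while the same start address with 64 KiB pages (`host_page_delta(0x10100, 0x10000)`) gives 0xFF00, so the two page sizes can disagree about where the first mappable page starts.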
On Fri, 7 Nov 2014 22:18:45 +0100
Alexander Graf <agraf@suse.de> wrote:

> Memory slots have to be page aligned to get entered into KVM. There
> is existing logic that tries to ensure that we pad memory slots that
> are not page aligned to the biggest region that would still fit in the
> alignment requirements.
>
> Unfortunately, that logic is broken. It tries to calculate the start
> offset based on the region size.
>
> Fix up the logic to do the thing it was intended to do and document it
> properly in the comment above it.
>
> With this patch applied, I can successfully run an e500 guest with more
> than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
>
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  kvm-all.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kvm-all.c b/kvm-all.c
> index 44a5e72..596e7ce 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -634,8 +634,10 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>      unsigned delta;
>
>      /* kvm works in page size chunks, but the function may be called
> -       with sub-page size and unaligned start address. */
> -    delta = TARGET_PAGE_ALIGN(size) - size;
> +       with sub-page size and unaligned start address. Pad the start
> +       address to next and truncate size to previous page boundary. */

I'm a bit confused how it works at all.
Let's assume that there are no mapped pages that include start_addr.
Then, if start_addr were padded to the next page, KVM would map it from
there, but the rest of QEMU would still use the unaligned start_addr for
a MemoryRegion that isn't even mapped.

It would seem that instead of padding up to the next page, start_addr
should be moved to the start of the page that includes it, to make the
page with the original start_addr available to the guest.

> +    delta = (TARGET_PAGE_SIZE - (start_addr & ~TARGET_PAGE_MASK));
> +    delta &= ~TARGET_PAGE_MASK;
>      if (delta > size) {
>          return;
>      }
On 10.11.14 13:31, Igor Mammedov wrote:
> On Fri, 7 Nov 2014 22:18:45 +0100
> Alexander Graf <agraf@suse.de> wrote:
>
>> Memory slots have to be page aligned to get entered into KVM. There
>> is existing logic that tries to ensure that we pad memory slots that
>> are not page aligned to the biggest region that would still fit in the
>> alignment requirements.
>>
>> Unfortunately, that logic is broken. It tries to calculate the start
>> offset based on the region size.
>>
>> Fix up the logic to do the thing it was intended to do and document it
>> properly in the comment above it.
>>
>> With this patch applied, I can successfully run an e500 guest with more
>> than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>>  kvm-all.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/kvm-all.c b/kvm-all.c
>> index 44a5e72..596e7ce 100644
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
>> @@ -634,8 +634,10 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>>      unsigned delta;
>>
>>      /* kvm works in page size chunks, but the function may be called
>> -       with sub-page size and unaligned start address. */
>> -    delta = TARGET_PAGE_ALIGN(size) - size;
>> +       with sub-page size and unaligned start address. Pad the start
>> +       address to next and truncate size to previous page boundary. */
> I'm a bit confused how it works at all.
> Let's assume that there are no mapped pages that include start_addr,
> then if start_addr were padded to next page, kvm would map it from there
> but the rest of QEMU would still use unaligned start_addr for MemoryRegion
> that isn't even mapped.

Sorry, I don't understand this paragraph. Memory slots in general are
accelerations for memory access - for MMIO (RAM is usually aligned), KVM
can always exit to QEMU and just do a manual MMIO exit.

> It would seem that instead of padding up to the next page, start_addr
> should be moved to the start of the page that includes it to make page
> with original start_addr available to guest.

No, because in that case you would map something as RAM that really
isn't RAM.

Imagine you have the following memory layout:

  0x1000 page size

  1) 0x00000 - 0x10000 RAM
  2) 0x10000 - 0x10100 MMIO
  3) 0x10100 - 0x20000 RAM

Then you want to map 1) as a memory slot and the part of 3) from 0x11000
onwards as a second memory slot.

You can't map the page from 0x10000 - 0x11000 as a memory slot, because
part of it is MMIO.

Alex
On 10/11/2014 14:16, Alexander Graf wrote:
> No, because in that case you would map something as RAM that really
> isn't RAM.
>
> Imagine you have the following memory layout:
>
> 0x1000 page size
>
> 1) 0x00000 - 0x10000 RAM
> 2) 0x10000 - 0x10100 MMIO
> 3) 0x10100 - 0x20000 RAM
>
> Then you want to map 1) as memory slot and the part of 3) from 0x11000
> onwards as memory slot.
>
> You can't map the page from 0x10000 - 0x11000 as memory slot, because
> part of it is MMIO.

Right. The partial RAM page remains marked as MMIO as far as KVM is
concerned, so accesses are slow and you cannot run code from it.
However, it is fundamental that MMIO areas are not marked as RAM.

Paolo
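The layout discussed above can be worked through with a small C sketch. It is an illustration under stated assumptions, not the QEMU function: `ram_to_slot` and `struct slot_range` are hypothetical names, the page size is fixed at the 0x1000 from the example, and the steps (pad the start up, trim the size down, give up if nothing is left) follow the comment in the patch rather than the full kvm_set_phys_mem() body, which the thread does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SIZE UINT64_C(0x1000)   /* page size from the example */

/* Hypothetical result type: the page-aligned part of a RAM region. */
struct slot_range {
    uint64_t start;
    uint64_t size;
};

/* Hypothetical helper: shrink [start, start + size) to the largest range
 * made of whole pages. Returns false if no whole page fits. */
static bool ram_to_slot(uint64_t start, uint64_t size, struct slot_range *out)
{
    assert(out != NULL);

    const uint64_t offset_mask = EXAMPLE_PAGE_SIZE - 1;

    /* Bytes between start and the next page boundary (0 if aligned). */
    uint64_t delta = (EXAMPLE_PAGE_SIZE - (start & offset_mask)) & offset_mask;
    if (delta > size) {
        return false;               /* region ends before the boundary */
    }

    start += delta;                 /* pad the start up to the boundary */
    size -= delta;
    size &= ~offset_mask;           /* truncate the size to whole pages */
    if (size == 0) {
        return false;
    }

    out->start = start;
    out->size = size;
    return true;
}
```

With these assumptions, region 3) (start 0x10100, size 0xFF00) yields a slot starting at 0x11000 with size 0xF000, region 1) is returned unchanged, and a sub-page region such as start 0x10100 with size 0x200 produces no slot at all.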
On 10 November 2014 13:16, Alexander Graf <agraf@suse.de> wrote:
> Sorry, I don't understand this paragraph. Memory slots in general are
> accelerations for memory access - for MMIO (RAM is usually aligned), KVM
> can always exit to QEMU and just do a manual MMIO exit.

...you're a bit stuck if you were hoping to execute code from
that RAM, though, so they're not *purely* acceleration, right?

-- PMM
On Mon, 10 Nov 2014 14:16:58 +0100
Alexander Graf <agraf@suse.de> wrote:

>
> On 10.11.14 13:31, Igor Mammedov wrote:
> > On Fri, 7 Nov 2014 22:18:45 +0100
> > Alexander Graf <agraf@suse.de> wrote:
> >
> >> Memory slots have to be page aligned to get entered into KVM. There
> >> is existing logic that tries to ensure that we pad memory slots that
> >> are not page aligned to the biggest region that would still fit in the
> >> alignment requirements.
> >>
> >> Unfortunately, that logic is broken. It tries to calculate the start
> >> offset based on the region size.
> >>
> >> Fix up the logic to do the thing it was intended to do and document it
> >> properly in the comment above it.
> >>
> >> With this patch applied, I can successfully run an e500 guest with more
> >> than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
> >>
> >> Cc: qemu-stable@nongnu.org
> >> Signed-off-by: Alexander Graf <agraf@suse.de>
> >> ---
> >>  kvm-all.c | 6 ++++--
> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kvm-all.c b/kvm-all.c
> >> index 44a5e72..596e7ce 100644
> >> --- a/kvm-all.c
> >> +++ b/kvm-all.c
> >> @@ -634,8 +634,10 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
> >>      unsigned delta;
> >>
> >>      /* kvm works in page size chunks, but the function may be called
> >> -       with sub-page size and unaligned start address. */
> >> -    delta = TARGET_PAGE_ALIGN(size) - size;
> >> +       with sub-page size and unaligned start address. Pad the start
> >> +       address to next and truncate size to previous page boundary. */
> > I'm a bit confused how it works at all.
> > Let's assume that there are no mapped pages that include start_addr,
> > then if start_addr were padded to next page, kvm would map it from there
> > but the rest of QEMU would still use unaligned start_addr for MemoryRegion
> > that isn't even mapped.
>
> Sorry, I don't understand this paragraph. Memory slots in general are
> accelerations for memory access - for MMIO (RAM is usually aligned), KVM
> can always exit to QEMU and just do a manual MMIO exit.
>
> > It would seem that instead of padding up to the next page, start_addr
> > should be moved to the start of the page that includes it to make page
> > with original start_addr available to guest.
>
> No, because in that case you would map something as RAM that really
> isn't RAM.
>
> Imagine you have the following memory layout:
>
> 0x1000 page size
>
> 1) 0x00000 - 0x10000 RAM
> 2) 0x10000 - 0x10100 MMIO
> 3) 0x10100 - 0x20000 RAM
>
> Then you want to map 1) as memory slot and the part of 3) from 0x11000
> onwards as memory slot.

So every access to the RAM at 0x10100-0x11000, which is not represented
as a memory slot, would cause a VMEXIT?

>
> You can't map the page from 0x10000 - 0x11000 as memory slot, because
> part of it is MMIO.
>
>
> Alex
On 10.11.14 14:55, Igor Mammedov wrote:
> On Mon, 10 Nov 2014 14:16:58 +0100
> Alexander Graf <agraf@suse.de> wrote:
>
>>
>> On 10.11.14 13:31, Igor Mammedov wrote:
>>> On Fri, 7 Nov 2014 22:18:45 +0100
>>> Alexander Graf <agraf@suse.de> wrote:
>>>
>>>> Memory slots have to be page aligned to get entered into KVM. There
>>>> is existing logic that tries to ensure that we pad memory slots that
>>>> are not page aligned to the biggest region that would still fit in the
>>>> alignment requirements.
>>>>
>>>> Unfortunately, that logic is broken. It tries to calculate the start
>>>> offset based on the region size.
>>>>
>>>> Fix up the logic to do the thing it was intended to do and document it
>>>> properly in the comment above it.
>>>>
>>>> With this patch applied, I can successfully run an e500 guest with more
>>>> than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
>>>>
>>>> Cc: qemu-stable@nongnu.org
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>> ---
>>>>  kvm-all.c | 6 ++++--
>>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kvm-all.c b/kvm-all.c
>>>> index 44a5e72..596e7ce 100644
>>>> --- a/kvm-all.c
>>>> +++ b/kvm-all.c
>>>> @@ -634,8 +634,10 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>>>>      unsigned delta;
>>>>
>>>>      /* kvm works in page size chunks, but the function may be called
>>>> -       with sub-page size and unaligned start address. */
>>>> -    delta = TARGET_PAGE_ALIGN(size) - size;
>>>> +       with sub-page size and unaligned start address. Pad the start
>>>> +       address to next and truncate size to previous page boundary. */
>>> I'm a bit confused how it works at all.
>>> Let's assume that there are no mapped pages that include start_addr,
>>> then if start_addr were padded to next page, kvm would map it from there
>>> but the rest of QEMU would still use unaligned start_addr for MemoryRegion
>>> that isn't even mapped.
>>
>> Sorry, I don't understand this paragraph. Memory slots in general are
>> accelerations for memory access - for MMIO (RAM is usually aligned), KVM
>> can always exit to QEMU and just do a manual MMIO exit.
>>
>>> It would seem that instead of padding up to the next page, start_addr
>>> should be moved to the start of the page that includes it to make page
>>> with original start_addr available to guest.
>>
>> No, because in that case you would map something as RAM that really
>> isn't RAM.
>>
>> Imagine you have the following memory layout:
>>
>> 0x1000 page size
>>
>> 1) 0x00000 - 0x10000 RAM
>> 2) 0x10000 - 0x10100 MMIO
>> 3) 0x10100 - 0x20000 RAM
>>
>> Then you want to map 1) as memory slot and the part of 3) from 0x11000
>> onwards as memory slot.
> So every access to the RAM at 0x10100-0x11000, which is not represented
> as a memory slot, would cause a VMEXIT?

Yes, there's no other way. Otherwise we wouldn't be able to trap on
accesses to 0x10000 - 0x10100. Hardware only gives us page granularity.

Usually this isn't an issue, because overlapping MMIO regions are pretty
large chunks of power-of-2 size - if you see any overlapping at all. On
e500 this bites us though, because we end up with small MSI-X windows
inside our address space (which in turn might also be a bug, but that
doesn't mean that the slot mapping logic should be left as broken as it
is).

Alex
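The consequence described above (page granularity means the RAM sharing a page with MMIO is trapped as well) can be modelled with a toy lookup in C. This is purely an illustrative sketch, not how KVM resolves guest accesses: the slot table and `access_exits_to_qemu` are hypothetical, and the table simply hard-codes the two slots that the example layout would produce with 0x1000 pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a memory slot: a page-aligned range of guest
 * physical addresses that is backed directly by RAM. */
struct mem_slot {
    uint64_t start;
    uint64_t size;
};

/* Slots for the example layout, assuming 0x1000 pages:
 * region 1) as is, and region 3) trimmed to start at 0x11000. */
static const struct mem_slot example_slots[] = {
    { 0x00000, 0x10000 },
    { 0x11000, 0x0F000 },
};

/* Toy model: an access that hits no slot has to be handled by exiting
 * to QEMU, whether the address is MMIO or RAM on a partial page. */
static bool access_exits_to_qemu(uint64_t guest_addr)
{
    size_t n = sizeof(example_slots) / sizeof(example_slots[0]);

    for (size_t i = 0; i < n; i++) {
        assert(example_slots[i].size != 0);
        if (guest_addr >= example_slots[i].start &&
            guest_addr - example_slots[i].start < example_slots[i].size) {
            return false;           /* served directly from the slot */
        }
    }
    return true;
}
```

In this model 0x10080 (inside the MMIO window) and 0x10800 (RAM, but on the same page as the MMIO window) both exit to QEMU, while 0x0FFFF and 0x11000 are served from a slot.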
On 10.11.14 14:55, Peter Maydell wrote:
> On 10 November 2014 13:16, Alexander Graf <agraf@suse.de> wrote:
>> Sorry, I don't understand this paragraph. Memory slots in general are
>> accelerations for memory access - for MMIO (RAM is usually aligned), KVM
>> can always exit to QEMU and just do a manual MMIO exit.
>
> ...you're a bit stuck if you were hoping to execute code from
> that RAM, though, so they're not *purely* acceleration, right?

Yes and no. Technically, there's no reason KVM couldn't do an MMIO exit
dance to fetch the next instruction. From user space this should be
indistinguishable. Today, I don't think it's implemented though :).

Alex
diff --git a/kvm-all.c b/kvm-all.c
index 44a5e72..596e7ce 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -634,8 +634,10 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
     unsigned delta;
 
     /* kvm works in page size chunks, but the function may be called
-       with sub-page size and unaligned start address. */
-    delta = TARGET_PAGE_ALIGN(size) - size;
+       with sub-page size and unaligned start address. Pad the start
+       address to next and truncate size to previous page boundary. */
+    delta = (TARGET_PAGE_SIZE - (start_addr & ~TARGET_PAGE_MASK));
+    delta &= ~TARGET_PAGE_MASK;
     if (delta > size) {
         return;
     }
Memory slots have to be page aligned to get entered into KVM. There
is existing logic that tries to ensure that we pad memory slots that
are not page aligned to the biggest region that would still fit in the
alignment requirements.

Unfortunately, that logic is broken. It tries to calculate the start
offset based on the region size.

Fix up the logic to do the thing it was intended to do and document it
properly in the comment above it.

With this patch applied, I can successfully run an e500 guest with more
than 3GB RAM (at which point RAM starts overlapping subpage memory regions).

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
---
 kvm-all.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
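To make the "start offset based on the region size" problem concrete, here is a hedged side-by-side of the two delta calculations in C. The `PG_*` macros are local stand-ins written for this sketch, assuming 4 KiB pages and the usual meaning of the TARGET_PAGE_* macros (mask = ~(size - 1), align = round up); `delta_old`, `delta_new`, `padded_start` and `is_page_aligned` are hypothetical names, and the sketch assumes the delta is subsequently added to the start address, as the patch comment describes.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the TARGET_PAGE_* macros, assuming 4 KiB pages. */
#define PG_SIZE      UINT64_C(0x1000)
#define PG_MASK      (~(PG_SIZE - 1))
#define PG_ALIGN(x)  (((x) + PG_SIZE - 1) & PG_MASK)

/* Pre-patch calculation: derived from the region size alone, so the
 * start address never enters into it. */
static uint64_t delta_old(uint64_t size)
{
    return PG_ALIGN(size) - size;
}

/* Post-patch calculation: distance from start_addr to the next page
 * boundary. The final mask turns PG_SIZE into 0 for an aligned start. */
static uint64_t delta_new(uint64_t start_addr)
{
    uint64_t delta = PG_SIZE - (start_addr & ~PG_MASK);
    delta &= ~PG_MASK;
    return delta;
}

/* Start address after padding with the post-patch delta. */
static uint64_t padded_start(uint64_t start_addr)
{
    uint64_t delta = delta_new(start_addr);
    assert(delta < PG_SIZE);        /* never skips a whole page */
    return start_addr + delta;
}

/* Non-zero if addr sits on a page boundary. */
static int is_page_aligned(uint64_t addr)
{
    return (addr & ~PG_MASK) == 0;
}
```

For a region starting at 0x10100 with size 0xFF00, `delta_old(0xFF00)` is 0x100, which would move the start to 0x10200 and leave it unaligned, whereas `delta_new(0x10100)` is 0xF00 and `padded_start(0x10100)` is 0x11000. The two formulas only agree when the region happens to end on a page boundary.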