Message ID | 560E9004.8030604@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Hi David,

On 02/10/15 15:09, David Vrabel wrote:
> On 30/09/15 11:45, Julien Grall wrote:
>> For ARM64 guests, Linux is able to support either 64K or 4K page
>> granularity. Although, the hypercall interface is always based on 4K
>> page granularity.
>>
>> With 64K page granularity, a single page will be spread over multiple
>> Xen frame.
>>
>> To avoid splitting the page into 4K frame, take advantage of the
>> extent_order field to directly allocate/free chunk of the Linux page
>> size.
>>
>> Note that PVMMU is only used for PV guest (which is x86) and the page
>> granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
>> that because the code has not been modified.
>
> This causes a BUG() in x86 PV guests when decreasing the reservation.
>
> Xen says:
>
> (XEN) d0v2 Error pfn 0: rd=0 od=32753 caf=8000000000000001
> taf=7400000000000001
> (XEN) memory.c:250:d0v2 Bad page free for domain 0
>
> And Linux BUGs with:
>
> [ 82.032654] kernel BUG at
> /anfs/drall/scratch/davidvr/x86/linux/drivers/xen/balloon.c:540!
>
> Which is a non-zero return value from the decrease_reservation hypercall.
>
> The frame_list[] has been incorrectly populated. The below patch fixes
> it for me. Please test as well.

Sorry for the breakage. I think I didn't spot the bug on my board
because most of the PV drivers allocate one balloon page at a time
by default.

This patch looks valid to me. In an early version, i was reset before
the loop and incremented on each iteration. However, I dropped that by
mistake when I switched to a different way of decreasing the reservation.

Regards,
On 02/10/15 15:31, Julien Grall wrote:
> Hi David,
>
> On 02/10/15 15:09, David Vrabel wrote:
>> On 30/09/15 11:45, Julien Grall wrote:
>>> For ARM64 guests, Linux is able to support either 64K or 4K page
>>> granularity. Although, the hypercall interface is always based on 4K
>>> page granularity.
>>>
>>> With 64K page granularity, a single page will be spread over multiple
>>> Xen frame.
>>>
>>> To avoid splitting the page into 4K frame, take advantage of the
>>> extent_order field to directly allocate/free chunk of the Linux page
>>> size.
>>>
>>> Note that PVMMU is only used for PV guest (which is x86) and the page
>>> granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
>>> that because the code has not been modified.
>>
>> This causes a BUG() in x86 PV guests when decreasing the reservation.
>>
>> Xen says:
>>
>> (XEN) d0v2 Error pfn 0: rd=0 od=32753 caf=8000000000000001
>> taf=7400000000000001
>> (XEN) memory.c:250:d0v2 Bad page free for domain 0
>>
>> And Linux BUGs with:
>>
>> [ 82.032654] kernel BUG at
>> /anfs/drall/scratch/davidvr/x86/linux/drivers/xen/balloon.c:540!
>>
>> Which is a non-zero return value from the decrease_reservation hypercall.
>>
>> The frame_list[] has been incorrectly populated. The below patch fixes
>> it for me. Please test as well.

FYI, I've just tested with the patch on ARM64 and I haven't seen any issue.

> Sorry for the breakage. I think I didn't spot the bug on my board
> because most of the PV drivers allocate one balloon page at a time
> by default.
>
> This patch looks valid to me. In an early version, i was reset before
> the loop and incremented on each iteration. However, I dropped that by
> mistake when I switched to a different way of decreasing the reservation.
On 10/02/2015 10:52 AM, Julien Grall wrote:
> On 02/10/15 15:31, Julien Grall wrote:
>> Hi David,
>>
>> On 02/10/15 15:09, David Vrabel wrote:
>>> On 30/09/15 11:45, Julien Grall wrote:
>>>> For ARM64 guests, Linux is able to support either 64K or 4K page
>>>> granularity. Although, the hypercall interface is always based on 4K
>>>> page granularity.
>>>>
>>>> With 64K page granularity, a single page will be spread over multiple
>>>> Xen frame.
>>>>
>>>> To avoid splitting the page into 4K frame, take advantage of the
>>>> extent_order field to directly allocate/free chunk of the Linux page
>>>> size.
>>>>
>>>> Note that PVMMU is only used for PV guest (which is x86) and the page
>>>> granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
>>>> that because the code has not been modified.
>>> This causes a BUG() in x86 PV guests when decreasing the reservation.
>>>
>>> Xen says:
>>>
>>> (XEN) d0v2 Error pfn 0: rd=0 od=32753 caf=8000000000000001
>>> taf=7400000000000001
>>> (XEN) memory.c:250:d0v2 Bad page free for domain 0
>>>
>>> And Linux BUGs with:
>>>
>>> [ 82.032654] kernel BUG at
>>> /anfs/drall/scratch/davidvr/x86/linux/drivers/xen/balloon.c:540!
>>>
>>> Which is a non-zero return value from the decrease_reservation hypercall.
>>>
>>> The frame_list[] has been incorrectly populated. The below patch fixes
>>> it for me. Please test as well.
> FYI, I've just tested with the patch on ARM64 and I haven't seen any issue.

I had a quick one-off test and this fixes it on x86. I'll schedule it
for the overnight run too.

-boris

>
>> Sorry for the breakage. I think I didn't spot the bug on my board
>> because most of the PV drivers allocate one balloon page at a time
>> by default.
>>
>> This patch looks valid to me. In an early version, i was reset before
>> the loop and incremented on each iteration. However, I dropped that by
>> mistake when I switched to a different way of decreasing the reservation.
On 02/10/15 15:52, Julien Grall wrote:
> On 02/10/15 15:31, Julien Grall wrote:
>> Hi David,
>>
>> On 02/10/15 15:09, David Vrabel wrote:
>>> On 30/09/15 11:45, Julien Grall wrote:
>>>> For ARM64 guests, Linux is able to support either 64K or 4K page
>>>> granularity. Although, the hypercall interface is always based on 4K
>>>> page granularity.
>>>>
>>>> With 64K page granularity, a single page will be spread over multiple
>>>> Xen frame.
>>>>
>>>> To avoid splitting the page into 4K frame, take advantage of the
>>>> extent_order field to directly allocate/free chunk of the Linux page
>>>> size.
>>>>
>>>> Note that PVMMU is only used for PV guest (which is x86) and the page
>>>> granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure
>>>> that because the code has not been modified.
>>>
>>> This causes a BUG() in x86 PV guests when decreasing the reservation.
>>>
>>> Xen says:
>>>
>>> (XEN) d0v2 Error pfn 0: rd=0 od=32753 caf=8000000000000001
>>> taf=7400000000000001
>>> (XEN) memory.c:250:d0v2 Bad page free for domain 0
>>>
>>> And Linux BUGs with:
>>>
>>> [ 82.032654] kernel BUG at
>>> /anfs/drall/scratch/davidvr/x86/linux/drivers/xen/balloon.c:540!
>>>
>>> Which is a non-zero return value from the decrease_reservation hypercall.
>>>
>>> The frame_list[] has been incorrectly populated. The below patch fixes
>>> it for me. Please test as well.
>
> FYI, I've just tested with the patch on ARM64 and I haven't seen any issue.

Thanks, I've folded it in.

Boris, can we get another test run on the for-linus-4.4 branch, please?

David
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -504,9 +504,10 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 	 * Setup the frame, update direct mapping, invalidate P2M,
 	 * and add to balloon.
 	 */
+	i = 0;
 	list_for_each_entry_safe(page, tmp, &pages, lru) {
 		/* XENMEM_decrease_reservation requires a GFN */
-		frame_list[i] = xen_page_to_gfn(page);
+		frame_list[i++] = xen_page_to_gfn(page);
 
 #ifdef CONFIG_XEN_HAVE_PVMMU
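
For readers following the thread, here is a minimal user-space sketch of the indexing mistake the patch corrects. It is not the kernel code: page_to_gfn() is a hypothetical stand-in for xen_page_to_gfn(), the plain array of pages replaces the kernel's linked list, and the starting value of i in the buggy variant is an assumption passed in by the caller, since the thread does not say what value i held on entry to the loop.

```c
#include <stddef.h>

/* Hypothetical stand-in for xen_page_to_gfn(): a "page" here is just its GFN. */
static unsigned long page_to_gfn(unsigned long page)
{
	return page;
}

/*
 * Before the fix: the index is never advanced inside the loop, so every
 * GFN is written to the same slot and the other slots keep stale values.
 * The starting index i is an assumption supplied by the caller.
 */
static void fill_frame_list_buggy(const unsigned long *pages, size_t n,
				  unsigned long *frame_list, size_t i)
{
	for (size_t k = 0; k < n; k++)
		frame_list[i] = page_to_gfn(pages[k]);
}

/* After the fix: reset the index first, then advance it once per page. */
static void fill_frame_list_fixed(const unsigned long *pages, size_t n,
				  unsigned long *frame_list)
{
	size_t i = 0;

	for (size_t k = 0; k < n; k++)
		frame_list[i++] = page_to_gfn(pages[k]);
}
```

With a zero-initialised frame_list and several pages, the buggy variant leaves all but one slot at 0, while the fixed variant stores one GFN per slot. Stale zero entries would be consistent with the "Error pfn 0" line in the Xen log above, although the thread itself does not confirm which stale values were passed to the hypercall.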