Message ID | 20170307130009.GA2154@node (mailing list archive) |
---|---|
State | New, archived |
>> Don't we need to pass vaddr down to all routines so that they select
>> appropriate tables? You seem to always be choosing the first one.
> IIUC, we clear the whole page table subtree covered by one pgd entry.
> So, no, there's no need to pass vaddr down. Just a pointer to the page
> table entry is enough.
>
> But I know virtually nothing about Xen. Please re-check my reasoning.

Yes, we effectively remove the whole page table for vaddr, so I guess
it's OK.

>
> I would also appreciate help with getting the x86 Xen code to work with
> 5-level paging enabled. For now I make CONFIG_XEN dependent on
> !CONFIG_X86_5LEVEL.

Hmmm... that's a problem, since this requires changes in the hypervisor,
and even if/when those changes are made, older versions of the hypervisor
still will not be able to run such guests.

This affects only PV guests. There is a series under review that provides
clean code separation with CONFIG_XEN_PV, but because, for example, dom0
(the Xen control domain) is PV, this will significantly limit the
availability of dom0-capable kernels (since I assume distros will want
CONFIG_X86_5LEVEL).

>
> Fixup:

Yes, that works. (But then, it worked even without this change, because
the problems caused by the missing flush would be intermittent. And a joy
to debug.)

-boris
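A note on the fixup Boris tested: it addresses a classic accumulate-and-propagate bug. Each walker computes a flush flag, but the callers were dropping the recursive calls' return values, so a flush requested at a lower level of the tree could be silently lost. A minimal standalone C model of the pattern (hypothetical names, not the actual Xen code):

#include <stdbool.h>
#include <stdio.h>

/* Model of one walker level: pretend only entry 3 needs a flush. */
static bool walk_leaf(int i)
{
	return i == 3;
}

static bool walk_mid_buggy(void)
{
	bool flush = false;
	int i;

	for (i = 0; i < 8; i++)
		walk_leaf(i);		/* BUG: return value dropped */
	return flush;			/* always false */
}

static bool walk_mid_fixed(void)
{
	bool flush = false;
	int i;

	for (i = 0; i < 8; i++)
		flush |= walk_leaf(i);	/* accumulate, as the fixup does */
	return flush;
}

int main(void)
{
	printf("buggy: %d, fixed: %d\n", walk_mid_buggy(), walk_mid_fixed());
	return 0;
}

The buggy variant returns false even though entry 3 asked for a flush, so the caller never flushes; that matches the intermittent failures described above.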
On 07/03/17 18:18, Boris Ostrovsky wrote:
>>> Don't we need to pass vaddr down to all routines so that they select
>>> appropriate tables? You seem to always be choosing the first one.
>> IIUC, we clear the whole page table subtree covered by one pgd entry.
>> So, no, there's no need to pass vaddr down. Just a pointer to the page
>> table entry is enough.
>>
>> But I know virtually nothing about Xen. Please re-check my reasoning.
> Yes, we effectively remove the whole page table for vaddr, so I guess
> it's OK.
>
>> I would also appreciate help with getting the x86 Xen code to work with
>> 5-level paging enabled. For now I make CONFIG_XEN dependent on
>> !CONFIG_X86_5LEVEL.
> Hmmm... that's a problem, since this requires changes in the hypervisor,
> and even if/when those changes are made, older versions of the hypervisor
> still will not be able to run such guests.
>
> This affects only PV guests. There is a series under review that provides
> clean code separation with CONFIG_XEN_PV, but because, for example, dom0
> (the Xen control domain) is PV, this will significantly limit the
> availability of dom0-capable kernels (since I assume distros will want
> CONFIG_X86_5LEVEL).

Wasn't the plan to be able to automatically detect 4- vs 5-level support,
and cope either way, so distros didn't have to ship two different builds
of Linux?

If so, all we need to do is get things to compile sensibly, and have the
PV entry code in Linux configure the rest of the kernel appropriately.

(If not, please ignore me.)

~Andrew
On 03/07/2017 01:26 PM, Andrew Cooper wrote:
> On 07/03/17 18:18, Boris Ostrovsky wrote:
>>>> Don't we need to pass vaddr down to all routines so that they select
>>>> appropriate tables? You seem to always be choosing the first one.
>>> IIUC, we clear the whole page table subtree covered by one pgd entry.
>>> So, no, there's no need to pass vaddr down. Just a pointer to the page
>>> table entry is enough.
>>>
>>> But I know virtually nothing about Xen. Please re-check my reasoning.
>> Yes, we effectively remove the whole page table for vaddr, so I guess
>> it's OK.
>>
>>> I would also appreciate help with getting the x86 Xen code to work with
>>> 5-level paging enabled. For now I make CONFIG_XEN dependent on
>>> !CONFIG_X86_5LEVEL.
>> Hmmm... that's a problem, since this requires changes in the hypervisor,
>> and even if/when those changes are made, older versions of the hypervisor
>> still will not be able to run such guests.
>>
>> This affects only PV guests. There is a series under review that provides
>> clean code separation with CONFIG_XEN_PV, but because, for example, dom0
>> (the Xen control domain) is PV, this will significantly limit the
>> availability of dom0-capable kernels (since I assume distros will want
>> CONFIG_X86_5LEVEL).
> Wasn't the plan to be able to automatically detect 4- vs 5-level support,
> and cope either way, so distros didn't have to ship two different builds
> of Linux?
>
> If so, all we need to do is get things to compile sensibly, and have the
> PV entry code in Linux configure the rest of the kernel appropriately.

I am not aware of any plans, but this would obviously be the preferred
route.

-boris
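For reference, CPU support for 5-level paging is discoverable at runtime: CPUID leaf 7, subleaf 0, reports the LA57 feature in ECX bit 16, which is what any boot-time "detect and cope either way" scheme would key off. A userspace sketch of the probe using GCC's <cpuid.h> (an illustration only, not the kernel's actual detection path):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID.(EAX=7, ECX=0):ECX bit 16 is the LA57 feature flag. */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
		puts("CPUID leaf 7 not supported");
		return 1;
	}
	printf("5-level paging (LA57) %ssupported by this CPU\n",
	       (ecx & (1u << 16)) ? "" : "not ");
	return 0;
}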
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index a4079cfab007..d66b7e79781a 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -629,7 +629,8 @@ static int xen_pud_walk(struct mm_struct *mm, pud_t *pud,
 		pmd = pmd_offset(&pud[i], 0);
 		if (PTRS_PER_PMD > 1)
 			flush |= (*func)(mm, virt_to_page(pmd), PT_PMD);
-		xen_pmd_walk(mm, pmd, func, last && i == nr - 1, limit);
+		flush |= xen_pmd_walk(mm, pmd, func,
+				last && i == nr - 1, limit);
 	}
 	return flush;
 }
@@ -650,7 +651,8 @@ static int xen_p4d_walk(struct mm_struct *mm, p4d_t *p4d,
 		pud = pud_offset(&p4d[i], 0);
 		if (PTRS_PER_PUD > 1)
 			flush |= (*func)(mm, virt_to_page(pud), PT_PUD);
-		xen_pud_walk(mm, pud, func, last && i == nr - 1, limit);
+		flush |= xen_pud_walk(mm, pud, func,
+				last && i == nr - 1, limit);
 	}
 	return flush;
 }
@@ -706,7 +708,7 @@ static int __xen_pgd_walk(struct mm_struct *mm, pgd_t *pgd,
 		p4d = p4d_offset(&pgd[i], 0);
 		if (PTRS_PER_P4D > 1)
 			flush |= (*func)(mm, virt_to_page(p4d), PT_P4D);
-		xen_p4d_walk(mm, p4d, func, i == nr - 1, limit);
+		flush |= xen_p4d_walk(mm, p4d, func, i == nr - 1, limit);
 	}

 	/* Do the top level last, so that the callbacks can use it as
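A side note on the PTRS_PER_* guards visible in these hunks: when a paging level is folded, as p4d is on 4-level builds, PTRS_PER_P4D is 1 and that level does not occupy a page table page of its own, so the walk skips the callback for it rather than visiting the same page twice. A small sketch of the folding convention (simplified constants for illustration; the real definitions live in the kernel's pgtable headers, cf. pgtable-nop4d.h):

#include <stdio.h>

/* Folded level: one entry, no page table page of its own. */
#ifdef FIVE_LEVEL
#define PTRS_PER_P4D 512
#else
#define PTRS_PER_P4D 1
#endif

int main(void)
{
	/* Mirrors the walk's guard on whether to run the callback. */
	if (PTRS_PER_P4D > 1)
		puts("p4d is a real level: visit its page table page");
	else
		puts("p4d folded into pgd: skip, no separate page");
	return 0;
}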