From patchwork Tue Mar 7 13:00:09 2017
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 9608871
Date: Tue, 7 Mar 2017 16:00:09 +0300
From: "Kirill A. Shutemov"
To: Boris Ostrovsky, "Zhang, Xiong Y"
Cc: linux-arch@vger.kernel.org, Juergen Gross, Andi Kleen, Arnd Bergmann,
 linux-mm@kvack.org, x86@kernel.org, linux-kernel@vger.kernel.org,
 Andy Lutomirski, Dave Hansen, Ingo Molnar, "H. Peter Anvin",
 Xiong Zhang, Andrew Morton, xen-devel, Linus Torvalds,
 Thomas Gleixner, "Kirill A. Shutemov"
Message-ID: <20170307130009.GA2154@node>
References: <20170306135357.3124-1-kirill.shutemov@linux.intel.com>
 <20170306135357.3124-19-kirill.shutemov@linux.intel.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Subject: Re: [Xen-devel] [PATCHv4 18/33] x86/xen: convert __xen_pgd_walk()
 and xen_cleanmfnmap() to support p4d

On Mon, Mar 06, 2017 at 03:48:24PM -0500, Boris Ostrovsky wrote:
> 
> > +static int xen_p4d_walk(struct mm_struct *mm, p4d_t *p4d,
> > +		int (*func)(struct mm_struct *mm, struct page *, enum pt_level),
> > +		bool last, unsigned long limit)
> > +{
> > +	int i, nr, flush = 0;
> > +
> > +	nr = last ? p4d_index(limit) + 1 : PTRS_PER_P4D;
> > +	for (i = 0; i < nr; i++) {
> > +		pud_t *pud;
> > +
> > +		if (p4d_none(p4d[i]))
> > +			continue;
> > +
> > +		pud = pud_offset(&p4d[i], 0);
> > +		if (PTRS_PER_PUD > 1)
> > +			flush |= (*func)(mm, virt_to_page(pud), PT_PUD);
> > +		xen_pud_walk(mm, pud, func, last && i == nr - 1, limit);
> > +	}
> > +	return flush;
> > +}
> 
> ..
> 
> > +	p4d = p4d_offset(&pgd[i], 0);
> > +	if (PTRS_PER_P4D > 1)
> > +		flush |= (*func)(mm, virt_to_page(p4d), PT_P4D);
> > +	xen_p4d_walk(mm, p4d, func, i == nr - 1, limit);
> 
> We are losing flush status at all levels so we need something like
> 
> 	flush |= xen_XXX_walk(...)

+ Xiong.

Thanks for noticing this. The fixup is below. Please test; I don't have a
setup for this.

> > 
> >  	}
> >  
> > -out:
> >  	/* Do the top level last, so that the callbacks can use it as
> >  	   a cue to do final things like tlb flushes. */
> >  	flush |= (*func)(mm, virt_to_page(pgd), PT_PGD);
> > @@ -1150,57 +1161,97 @@ static void __init xen_cleanmfnmap_free_pgtbl(void *pgtbl, bool unpin)
> >  	xen_free_ro_pages(pa, PAGE_SIZE);
> >  }
> >  
> > +static void __init xen_cleanmfnmap_pmd(pmd_t *pmd, bool unpin)
> > +{
> > +	unsigned long pa;
> > +	pte_t *pte_tbl;
> > +	int i;
> > +
> > +	if (pmd_large(*pmd)) {
> > +		pa = pmd_val(*pmd) & PHYSICAL_PAGE_MASK;
> > +		xen_free_ro_pages(pa, PMD_SIZE);
> > +		return;
> > +	}
> > +
> > +	pte_tbl = pte_offset_kernel(pmd, 0);
> > +	for (i = 0; i < PTRS_PER_PTE; i++) {
> > +		if (pte_none(pte_tbl[i]))
> > +			continue;
> > +		pa = pte_pfn(pte_tbl[i]) << PAGE_SHIFT;
> > +		xen_free_ro_pages(pa, PAGE_SIZE);
> > +	}
> > +	set_pmd(pmd, __pmd(0));
> > +	xen_cleanmfnmap_free_pgtbl(pte_tbl, unpin);
> > +}
> > +
> > +static void __init xen_cleanmfnmap_pud(pud_t *pud, bool unpin)
> > +{
> > +	unsigned long pa;
> > +	pmd_t *pmd_tbl;
> > +	int i;
> > +
> > +	if (pud_large(*pud)) {
> > +		pa = pud_val(*pud) & PHYSICAL_PAGE_MASK;
> > +		xen_free_ro_pages(pa, PUD_SIZE);
> > +		return;
> > +	}
> > +
> > +	pmd_tbl = pmd_offset(pud, 0);
> > +	for (i = 0; i < PTRS_PER_PMD; i++) {
> > +		if (pmd_none(pmd_tbl[i]))
> > +			continue;
> > +		xen_cleanmfnmap_pmd(pmd_tbl + i, unpin);
> > +	}
> > +	set_pud(pud, __pud(0));
> > +	xen_cleanmfnmap_free_pgtbl(pmd_tbl, unpin);
> > +}
> > +
> > +static void __init xen_cleanmfnmap_p4d(p4d_t *p4d, bool unpin)
> > +{
> > +	unsigned long pa;
> > +	pud_t *pud_tbl;
> > +	int i;
> > +
> > +	if (p4d_large(*p4d)) {
> > +		pa = p4d_val(*p4d) & PHYSICAL_PAGE_MASK;
> > +		xen_free_ro_pages(pa, P4D_SIZE);
> > +		return;
> > +	}
> > +
> > +	pud_tbl = pud_offset(p4d, 0);
> > +	for (i = 0; i < PTRS_PER_PUD; i++) {
> > +		if (pud_none(pud_tbl[i]))
> > +			continue;
> > +		xen_cleanmfnmap_pud(pud_tbl + i, unpin);
> > +	}
> > +	set_p4d(p4d, __p4d(0));
> > +	xen_cleanmfnmap_free_pgtbl(pud_tbl, unpin);
> > +}
> > +
> >  /*
> >   * Since it is well isolated we can (and since it is perhaps large we should)
> >   * also free the page tables mapping the initial P->M table.
> >   */
> >  static void __init xen_cleanmfnmap(unsigned long vaddr)
> >  {
> > -	unsigned long va = vaddr & PMD_MASK;
> > -	unsigned long pa;
> > -	pgd_t *pgd = pgd_offset_k(va);
> > -	pud_t *pud_page = pud_offset(pgd, 0);
> > -	pud_t *pud;
> > -	pmd_t *pmd;
> > -	pte_t *pte;
> > +	pgd_t *pgd;
> > +	p4d_t *p4d;
> >  	unsigned int i;
> >  	bool unpin;
> >  
> >  	unpin = (vaddr == 2 * PGDIR_SIZE);
> > -	set_pgd(pgd, __pgd(0));
> > -	do {
> > -		pud = pud_page + pud_index(va);
> > -		if (pud_none(*pud)) {
> > -			va += PUD_SIZE;
> > -		} else if (pud_large(*pud)) {
> > -			pa = pud_val(*pud) & PHYSICAL_PAGE_MASK;
> > -			xen_free_ro_pages(pa, PUD_SIZE);
> > -			va += PUD_SIZE;
> > -		} else {
> > -			pmd = pmd_offset(pud, va);
> > -			if (pmd_large(*pmd)) {
> > -				pa = pmd_val(*pmd) & PHYSICAL_PAGE_MASK;
> > -				xen_free_ro_pages(pa, PMD_SIZE);
> > -			} else if (!pmd_none(*pmd)) {
> > -				pte = pte_offset_kernel(pmd, va);
> > -				set_pmd(pmd, __pmd(0));
> > -				for (i = 0; i < PTRS_PER_PTE; ++i) {
> > -					if (pte_none(pte[i]))
> > -						break;
> > -					pa = pte_pfn(pte[i]) << PAGE_SHIFT;
> > -					xen_free_ro_pages(pa, PAGE_SIZE);
> > -				}
> > -				xen_cleanmfnmap_free_pgtbl(pte, unpin);
> > -			}
> > -			va += PMD_SIZE;
> > -			if (pmd_index(va))
> > -				continue;
> > -			set_pud(pud, __pud(0));
> > -			xen_cleanmfnmap_free_pgtbl(pmd, unpin);
> > -		}
> > -
> > -	} while (pud_index(va) || pmd_index(va));
> > -	xen_cleanmfnmap_free_pgtbl(pud_page, unpin);
> > +	vaddr &= PMD_MASK;
> > +	pgd = pgd_offset_k(vaddr);
> > +	p4d = p4d_offset(pgd, 0);
> > +	for (i = 0; i < PTRS_PER_P4D; i++) {
> > +		if (p4d_none(p4d[i]))
> > +			continue;
> > +		xen_cleanmfnmap_p4d(p4d + i, unpin);
> > +	}
> 
> Don't we need to pass vaddr down to all routines so that they select
> appropriate tables? You seem to always be choosing the first one.

IIUC, we clear the whole page table subtree covered by one pgd entry, so
no, there's no need to pass vaddr down; a pointer to the page table entry
is enough.

But I know virtually nothing about Xen. Please re-check my reasoning.

I would also appreciate help with getting the x86 Xen code to work with
5-level paging enabled. For now I make CONFIG_XEN dependent on
!CONFIG_X86_5LEVEL.

Fixup:

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index a4079cfab007..d66b7e79781a 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -629,7 +629,8 @@ static int xen_pud_walk(struct mm_struct *mm, pud_t *pud,
 		pmd = pmd_offset(&pud[i], 0);
 		if (PTRS_PER_PMD > 1)
 			flush |= (*func)(mm, virt_to_page(pmd), PT_PMD);
-		xen_pmd_walk(mm, pmd, func, last && i == nr - 1, limit);
+		flush |= xen_pmd_walk(mm, pmd, func,
+				last && i == nr - 1, limit);
 	}
 	return flush;
 }
@@ -650,7 +651,8 @@ static int xen_p4d_walk(struct mm_struct *mm, p4d_t *p4d,
 		pud = pud_offset(&p4d[i], 0);
 		if (PTRS_PER_PUD > 1)
 			flush |= (*func)(mm, virt_to_page(pud), PT_PUD);
-		xen_pud_walk(mm, pud, func, last && i == nr - 1, limit);
+		flush |= xen_pud_walk(mm, pud, func,
+				last && i == nr - 1, limit);
 	}
 	return flush;
 }
@@ -706,7 +708,7 @@ static int __xen_pgd_walk(struct mm_struct *mm, pgd_t *pgd,
 		p4d = p4d_offset(&pgd[i], 0);
 		if (PTRS_PER_P4D > 1)
 			flush |= (*func)(mm, virt_to_page(p4d), PT_P4D);
-		xen_p4d_walk(mm, p4d, func, i == nr - 1, limit);
+		flush |= xen_p4d_walk(mm, p4d, func, i == nr - 1, limit);
 	}
 
 	/* Do the top level last, so that the callbacks can use it as
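P.S. To illustrate the earlier point about why vaddr never needs to be
passed down, here is a toy userspace model (made-up types, not the mmu.c
code): each teardown call gets a pointer to a single entry's table,
derives everything it needs from that pointer alone, and frees every
populated slot, so the subtree's position in the address space never
comes into play.

#include <stdio.h>
#include <stdlib.h>

#define SLOTS 4				/* stand-in for PTRS_PER_* */

struct table {
	struct table *entry[SLOTS];	/* NULL plays the role of p4d_none() etc. */
};

/* Tear down one entry's whole subtree; note: no address argument. */
static void teardown(struct table *tbl, int level)
{
	int i;

	for (i = 0; i < SLOTS; i++) {
		if (!tbl->entry[i])		/* "none" entry, skip */
			continue;
		if (level < 2)
			teardown(tbl->entry[i], level + 1);
		else
			free(tbl->entry[i]);	/* leaf "page" */
	}
	printf("freeing level-%d table\n", level);
	free(tbl);	/* stands in for xen_cleanmfnmap_free_pgtbl() */
}

int main(void)
{
	struct table *root = calloc(1, sizeof(*root));
	struct table *mid = calloc(1, sizeof(*mid));
	struct table *low = calloc(1, sizeof(*low));

	low->entry[0] = calloc(1, sizeof(struct table));	/* a data "page" */
	mid->entry[3] = low;
	root->entry[1] = mid;

	teardown(root, 0);	/* one pointer in, no vaddr anywhere */
	return 0;
}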