From patchwork Fri Aug 23 22:52:46 2019
X-Patchwork-Submitter: Nadav Amit
X-Patchwork-Id: 11112891
From: Nadav Amit
To: Andy Lutomirski, Dave Hansen
Date: Fri, 23 Aug 2019 15:52:46 -0700
Message-Id: <20190823225248.15597-2-namit@vmware.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190823225248.15597-1-namit@vmware.com>
References: <20190823225248.15597-1-namit@vmware.com>
Subject: [Xen-devel] [RFC PATCH v2 1/3] x86/mm/tlb: Change __flush_tlb_one_user interface
Cc: Juergen Gross, Stefano Stabellini, Peter Zijlstra, x86@kernel.org,
 linux-kernel@vger.kernel.org, Ingo Molnar, Nadav Amit,
 xen-devel@lists.xenproject.org, Thomas Gleixner, Boris Ostrovsky

__flush_tlb_one_user() currently flushes a single entry, and when PTI is
enabled it flushes the entry in both the kernel and the user page-tables.

Change __flush_tlb_one_user() and related interfaces into
__flush_tlb_range(), which flushes a range of entries and does not flush
the user page-table.

This refactoring is needed for the next patch, but it also makes sense on
its own and has several advantages. First, only Xen-PV, which does not use
PTI, implements the paravirtual interface of flush_tlb_one_user(), so
nothing is broken by separating the user and kernel page-table flushes,
and the interface becomes more intuitive.

Second, INVLPG can flush unrelated mappings, and it is also a serializing
instruction. It is better to have a tight loop that flushes the entries.

Third, __flush_tlb_one_kernel() currently also flushes the user
page-tables, which is not needed. This change makes it possible to avoid
that redundant flush.

Cc: Boris Ostrovsky
Cc: Juergen Gross
Cc: Stefano Stabellini
Cc: xen-devel@lists.xenproject.org
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/paravirt.h       |  5 ++--
 arch/x86/include/asm/paravirt_types.h |  3 ++-
 arch/x86/include/asm/tlbflush.h       | 24 +++++------------
 arch/x86/kernel/paravirt.c            |  7 ++---
 arch/x86/mm/tlb.c                     | 39 ++++++++++++++++++++++-----
 arch/x86/xen/mmu_pv.c                 | 21 +++++++++------
 6 files changed, 62 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index bc4829c9b3f9..7347328eacd3 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -57,9 +57,10 @@ static inline void __flush_tlb_global(void)
 	PVOP_VCALL0(mmu.flush_tlb_kernel);
 }
 
-static inline void __flush_tlb_one_user(unsigned long addr)
+static inline void __flush_tlb_range(unsigned long start, unsigned long end,
+				     u8 stride_shift)
 {
-	PVOP_VCALL1(mmu.flush_tlb_one_user, addr);
+	PVOP_VCALL3(mmu.flush_tlb_range, start, end, stride_shift);
 }
 
 static inline void flush_tlb_multi(const struct cpumask *cpumask,
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 63fa751344bf..a87a5f236251 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -205,7 +205,8 @@ struct pv_mmu_ops {
 	/* TLB operations */
 	void (*flush_tlb_user)(void);
 	void (*flush_tlb_kernel)(void);
-	void (*flush_tlb_one_user)(unsigned long addr);
+	void (*flush_tlb_range)(unsigned long start, unsigned long end,
+				u8 stride_shift);
 	void (*flush_tlb_multi)(const struct cpumask *cpus,
 				const struct flush_tlb_info *info);
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 1f88ea410ff3..421bc82504e2 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -145,7 +145,7 @@ static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid)
 #else
 #define __flush_tlb() __native_flush_tlb()
 #define __flush_tlb_global() __native_flush_tlb_global()
-#define __flush_tlb_one_user(addr) __native_flush_tlb_one_user(addr)
+#define __flush_tlb_range(start, end, stride_shift) __native_flush_tlb_range(start, end, stride_shift)
 #endif
 
 struct tlb_context {
@@ -454,23 +454,13 @@ static inline void __native_flush_tlb_global(void)
 /*
  * flush one page in the user mapping
  */
-static inline void __native_flush_tlb_one_user(unsigned long addr)
+static inline void __native_flush_tlb_range(unsigned long start,
+					    unsigned long end, u8 stride_shift)
 {
-	u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+	unsigned long addr;
 
-	asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
-
-	if (!static_cpu_has(X86_FEATURE_PTI))
-		return;
-
-	/*
-	 * Some platforms #GP if we call invpcid(type=1/2) before CR4.PCIDE=1.
-	 * Just use invalidate_user_asid() in case we are called early.
-	 */
-	if (!this_cpu_has(X86_FEATURE_INVPCID_SINGLE))
-		invalidate_user_asid(loaded_mm_asid);
-	else
-		invpcid_flush_one(user_pcid(loaded_mm_asid), addr);
+	for (addr = start; addr < end; addr += 1ul << stride_shift)
+		asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
 }
 
 /*
@@ -512,7 +502,7 @@ static inline void __flush_tlb_one_kernel(unsigned long addr)
 	 * kernel address space and for its usermode counterpart, but it does
 	 * not flush it for other address spaces.
 	 */
-	__flush_tlb_one_user(addr);
+	__flush_tlb_range(addr, addr + PAGE_SIZE, PAGE_SHIFT);
 
 	if (!static_cpu_has(X86_FEATURE_PTI))
 		return;
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 5520a04c84ba..195f5577d0d5 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -173,9 +173,10 @@ static void native_flush_tlb_global(void)
 	__native_flush_tlb_global();
 }
 
-static void native_flush_tlb_one_user(unsigned long addr)
+static void native_flush_tlb_range(unsigned long start, unsigned long end,
+				   u8 stride_shift)
 {
-	__native_flush_tlb_one_user(addr);
+	__native_flush_tlb_range(start, end, stride_shift);
 }
 
 struct static_key paravirt_steal_enabled;
@@ -358,7 +359,7 @@ struct paravirt_patch_template pv_ops = {
 	/* Mmu ops. */
 	.mmu.flush_tlb_user	= native_flush_tlb,
 	.mmu.flush_tlb_kernel	= native_flush_tlb_global,
-	.mmu.flush_tlb_one_user	= native_flush_tlb_one_user,
+	.mmu.flush_tlb_range	= native_flush_tlb_range,
 	.mmu.flush_tlb_multi	= native_flush_tlb_multi,
 	.mmu.tlb_remove_table	=
 		(void (*)(struct mmu_gather *, void *))tlb_remove_page,
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 0addc6e84126..ad15fc2c0790 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -512,6 +512,35 @@ void initialize_tlbstate_and_flush(void)
 		this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0);
 }
 
+static void flush_tlb_user_pt_range(u16 asid, const struct flush_tlb_info *f)
+{
+	unsigned long start, end, addr;
+	u8 stride_shift;
+
+	if (!static_cpu_has(X86_FEATURE_PTI))
+		return;
+
+	/*
+	 * Read the data into variable on the stack to prevent it from being
+	 * reevaluated due to invpcid memory clobber.
+	 */
+	start = f->start;
+	end = f->end;
+	stride_shift = f->stride_shift;
+
+	/*
+	 * Some platforms #GP if we call invpcid(type=1/2) before CR4.PCIDE=1.
+	 * Just use invalidate_user_asid() in case we are called early.
+	 */
+	if (!this_cpu_has(X86_FEATURE_INVPCID_SINGLE)) {
+		invalidate_user_asid(asid);
+		return;
+	}
+
+	for (addr = start; addr < end; addr += 1ul << stride_shift)
+		invpcid_flush_one(user_pcid(asid), addr);
+}
+
 /*
  * flush_tlb_func()'s memory ordering requirement is that any
  * TLB fills that happen after we flush the TLB are ordered after we
@@ -624,14 +653,12 @@ static void flush_tlb_func(void *info)
 		    f->new_tlb_gen == local_tlb_gen + 1 &&
 		    f->new_tlb_gen == mm_tlb_gen) {
 			/* Partial flush */
-			unsigned long addr = f->start;
-
 			nr_invalidate = (f->end - f->start) >> f->stride_shift;
 
-			while (addr < f->end) {
-				__flush_tlb_one_user(addr);
-				addr += 1UL << f->stride_shift;
-			}
+			__flush_tlb_range(f->start, f->end, f->stride_shift);
+
+			flush_tlb_user_pt_range(loaded_mm_asid, f);
+
 			if (local)
 				count_vm_tlb_events(NR_TLB_LOCAL_FLUSH_ONE, nr_invalidate);
 		} else {
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 48f7c7eb4dbc..ed68657f5e77 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -1325,22 +1325,27 @@ static noinline void xen_flush_tlb(void)
 	preempt_enable();
 }
 
-static void xen_flush_tlb_one_user(unsigned long addr)
+static void xen_flush_tlb_range(unsigned long start, unsigned long end,
+				u8 stride_shift)
 {
 	struct mmuext_op *op;
 	struct multicall_space mcs;
-
-	trace_xen_mmu_flush_tlb_one_user(addr);
+	unsigned long addr;
 
 	preempt_disable();
 
 	mcs = xen_mc_entry(sizeof(*op));
 	op = mcs.args;
-	op->cmd = MMUEXT_INVLPG_LOCAL;
-	op->arg1.linear_addr = addr & PAGE_MASK;
-	MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
 
-	xen_mc_issue(PARAVIRT_LAZY_MMU);
+	for (addr = start; addr < end; addr += 1ul << stride_shift) {
+		trace_xen_mmu_flush_tlb_one_user(addr);
+
+		op->cmd = MMUEXT_INVLPG_LOCAL;
+		op->arg1.linear_addr = addr & PAGE_MASK;
+		MULTI_mmuext_op(mcs.mc, op, 1, NULL, DOMID_SELF);
+
+		xen_mc_issue(PARAVIRT_LAZY_MMU);
+	}
 
 	preempt_enable();
 }
@@ -2394,7 +2399,7 @@ static const struct pv_mmu_ops xen_mmu_ops __initconst = {
 
 	.flush_tlb_user = xen_flush_tlb,
 	.flush_tlb_kernel = xen_flush_tlb,
-	.flush_tlb_one_user = xen_flush_tlb_one_user,
+	.flush_tlb_range = xen_flush_tlb_range,
 	.flush_tlb_multi = xen_flush_tlb_multi,
 	.tlb_remove_table = tlb_remove_table,
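
[Editor's note, not part of the patch] As a rough illustration of the stride
semantics of the new interface, the stand-alone C sketch below mimics the
flush loop of __native_flush_tlb_range(): it walks [start, end) in
(1 << stride_shift)-byte steps and prints the addresses that INVLPG would be
issued for, first with a 4 KB stride (stride_shift == PAGE_SHIFT) and then
with a 2 MB stride. The mock_flush_tlb_range() helper, the printf stand-in
for the invlpg asm, and the example addresses are illustrative only.

#include <stdio.h>

/*
 * Mimic the loop in __native_flush_tlb_range(): one flush per
 * (1 << stride_shift) bytes in [start, end).  In the kernel the loop
 * body is: asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
 */
static void mock_flush_tlb_range(unsigned long start, unsigned long end,
				 unsigned char stride_shift)
{
	unsigned long addr;

	for (addr = start; addr < end; addr += 1ul << stride_shift)
		printf("invlpg %#lx\n", addr);
}

int main(void)
{
	/* 2 MB range, 4 KB stride (PAGE_SHIFT == 12): 512 flushes. */
	mock_flush_tlb_range(0x7f0000000000ul, 0x7f0000200000ul, 12);

	/* Same range, 2 MB stride (PMD_SHIFT == 21): a single flush. */
	mock_flush_tlb_range(0x7f0000000000ul, 0x7f0000200000ul, 21);

	return 0;
}

Run as a normal user-space program, the first call prints 512 addresses and
the second prints one, which is the saving the tight per-stride loop gives
over issuing one INVLPG per 4 KB page.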