From patchwork Wed Feb 12 16:09:18 2020
X-Patchwork-Submitter: Wei Liu
X-Patchwork-Id: 11378701
From: Wei Liu
To: Xen Development List <xen-devel@lists.xenproject.org>
Cc: Wei Liu, Andrew Cooper, Paul Durrant, Michael Kelley, Jan Beulich,
 Roger Pau Monné
Date: Wed, 12 Feb 2020 16:09:18 +0000
Message-Id: <20200212160918.18470-5-liuwe@microsoft.com>
In-Reply-To: <20200212160918.18470-1-liuwe@microsoft.com>
References: <20200212160918.18470-1-liuwe@microsoft.com>
Subject: [Xen-devel] [PATCH 4/4] x86/hyperv: L0 assisted TLB flush

Implement L0 assisted TLB flush for Xen on Hyper-V. It takes advantage
of several hypercalls:

 * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST
 * HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX
 * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE
 * HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX

The most efficient hypercall available is picked at runtime: the
ADDRESS_LIST variants flush only the specified range, and the _EX
variants are used when the target set contains a VP index of 64 or
above.

Signed-off-by: Wei Liu <liuwe@microsoft.com>
---
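Note (illustration only, not for commit): the gva_list encoding is the
subtle part of this patch. Each 64-bit element carries a page-aligned
GVA in its upper bits, while the lower 12 bits hold the number of
*additional* 4K pages to flush, so one element covers at most 4096
pages (HV_TLB_FLUSH_UNIT). The standalone sketch below mirrors the
patch's fill_gva_list() arithmetic; the PAGE_* macros are redefined
locally and main() is a hypothetical driver, not part of the change.

/* Illustrative only -- fill_gva_list()'s encoding, outside Xen. */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE (1ul << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE - 1))
#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE) /* 16M covered per element */

static unsigned int fill_gva_list(uint64_t *gva_list, unsigned long start,
                                  unsigned long bytes)
{
    unsigned long end = start + bytes - 1; /* inclusive end of range */
    unsigned int n = 0;

    do {
        unsigned long remain = end > start ? end - start : 0;

        gva_list[n] = start & PAGE_MASK;

        if ( remain >= HV_TLB_FLUSH_UNIT )
        {
            /* Full element: 0xfff in the low bits = 4095 extra pages */
            gva_list[n] |= ~PAGE_MASK;
            start += HV_TLB_FLUSH_UNIT;
        }
        else if ( remain )
        {
            /* Partial element: remaining pages minus the first one */
            gva_list[n] |= (remain - 1) >> PAGE_SHIFT;
            start = end;
        }

        n++;
    } while ( start < end );

    return n;
}

int main(void)
{
    uint64_t list[8];
    /* Flush 4097 pages starting at a hypothetical kernel address */
    unsigned int n = fill_gva_list(list, 0xffff800000000000ul,
                                   4097 * PAGE_SIZE);

    for ( unsigned int i = 0; i < n; i++ )
        printf("entry %u: gva %#lx, extra pages %lu\n", i,
               (unsigned long)(list[i] & PAGE_MASK),
               (unsigned long)(list[i] & ~PAGE_MASK));

    return 0;
}

Flushing 4097 pages, for example, produces two elements: the first
encodes 4095 additional pages, the second encodes 0.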
 xen/arch/x86/guest/hyperv/Makefile  |   1 +
 xen/arch/x86/guest/hyperv/private.h |   9 ++
 xen/arch/x86/guest/hyperv/tlb.c     | 172 +++++++++++++++++++++++++++-
 xen/arch/x86/guest/hyperv/util.c    |  72 ++++++++++++
 4 files changed, 253 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/x86/guest/hyperv/util.c

diff --git a/xen/arch/x86/guest/hyperv/Makefile b/xen/arch/x86/guest/hyperv/Makefile
index 18902c33e9..0e39410968 100644
--- a/xen/arch/x86/guest/hyperv/Makefile
+++ b/xen/arch/x86/guest/hyperv/Makefile
@@ -1,2 +1,3 @@
 obj-y += hyperv.o
 obj-y += tlb.o
+obj-y += util.o
diff --git a/xen/arch/x86/guest/hyperv/private.h b/xen/arch/x86/guest/hyperv/private.h
index 78e52f74ce..311f060495 100644
--- a/xen/arch/x86/guest/hyperv/private.h
+++ b/xen/arch/x86/guest/hyperv/private.h
@@ -24,12 +24,21 @@
 
 #include <xen/cpumask.h>
 #include <xen/percpu.h>
+#include <asm/guest/hyperv-tlfs.h>
 
 DECLARE_PER_CPU(void *, hv_input_page);
 DECLARE_PER_CPU(void *, hv_vp_assist);
 DECLARE_PER_CPU(uint32_t, hv_vp_index);
 
+static inline uint32_t hv_vp_index(int cpu)
+{
+    return per_cpu(hv_vp_index, cpu);
+}
+
 int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
                      unsigned int flags);
 
+/* Returns the number of banks, -1 on error */
+int cpumask_to_vpset(struct hv_vpset *vpset, const cpumask_t *mask);
+
 #endif /* __XEN_HYPERV_PRIVIATE_H__ */
diff --git a/xen/arch/x86/guest/hyperv/tlb.c b/xen/arch/x86/guest/hyperv/tlb.c
index 48f527229e..99b789d9e9 100644
--- a/xen/arch/x86/guest/hyperv/tlb.c
+++ b/xen/arch/x86/guest/hyperv/tlb.c
@@ -19,15 +19,185 @@
  * Copyright (c) 2020 Microsoft.
  */
 
+#include <xen/cpu.h>
 #include <xen/cpumask.h>
 #include <xen/errno.h>
+#include <asm/guest/hyperv.h>
+#include <asm/guest/hyperv-hcall.h>
+#include <asm/guest/hyperv-tlfs.h>
+
+#include "private.h"
+
+/*
+ * It is possible to encode up to 4096 pages using the lower 12 bits
+ * in an element of gva_list
+ */
+#define HV_TLB_FLUSH_UNIT (4096 * PAGE_SIZE)
+#define ORDER_TO_BYTES(order) ((1ul << (order)) * PAGE_SIZE)
+
+static unsigned int fill_gva_list(uint64_t *gva_list, const void *va,
+                                  unsigned int order)
+{
+    unsigned long start = (unsigned long)va;
+    unsigned long end = start + ORDER_TO_BYTES(order) - 1;
+    unsigned int n = 0;
+
+    do {
+        unsigned long remain = end > start ? end - start : 0;
+
+        gva_list[n] = start & PAGE_MASK;
+
+        /*
+         * Use lower 12 bits to encode the number of additional pages
+         * to flush
+         */
+        if ( remain >= HV_TLB_FLUSH_UNIT )
+        {
+            gva_list[n] |= ~PAGE_MASK;
+            start += HV_TLB_FLUSH_UNIT;
+        }
+        else if ( remain )
+        {
+            gva_list[n] |= (remain - 1) >> PAGE_SHIFT;
+            start = end;
+        }
+
+        n++;
+    } while ( start < end );
+
+    return n;
+}
+
+static uint64_t flush_tlb_ex(const cpumask_t *mask, const void *va,
+                             unsigned int flags)
+{
+    struct hv_tlb_flush_ex *flush = this_cpu(hv_input_page);
+    int nr_banks;
+    unsigned int max_gvas;
+    unsigned int order = flags & FLUSH_ORDER_MASK;
+    uint64_t ret;
+
+    ASSERT(flush);
+    ASSERT(!local_irq_is_enabled());
+
+    if ( !(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED) )
+        return ~0ULL;
+
+    flush->address_space = 0;
+    flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+    if ( !(flags & FLUSH_TLB_GLOBAL) )
+        flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
+
+    flush->hv_vp_set.valid_bank_mask = 0;
+    flush->hv_vp_set.format = HV_GENERIC_SET_SPARSE_4K;
+
+    nr_banks = cpumask_to_vpset(&flush->hv_vp_set, mask);
+    if ( nr_banks < 0 )
+        return ~0ULL;
+
+    max_gvas =
+        (PAGE_SIZE - sizeof(*flush) - nr_banks *
+         sizeof(flush->hv_vp_set.bank_contents[0])) /
+        sizeof(uint64_t); /* gva is represented as uint64_t */
+
+    /*
+     * Flush the entire address space if va is NULL or if there is not
+     * enough space for gva_list.
+     */
+    if ( !va || (ORDER_TO_BYTES(order) / HV_TLB_FLUSH_UNIT) > max_gvas )
+        ret = hv_do_rep_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX, 0,
+                                  nr_banks, virt_to_maddr(flush), 0);
+    else
+    {
+        /* The gva list follows the last bank of the VP set */
+        uint64_t *gva_list = flush->hv_vp_set.bank_contents + nr_banks;
+        unsigned int gvas = fill_gva_list(gva_list, va, order);
+
+        ret = hv_do_rep_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX,
+                                  gvas, nr_banks, virt_to_maddr(flush), 0);
+    }
+
+    return ret;
+}
+
 int hyperv_flush_tlb(const cpumask_t *mask, const void *va,
                      unsigned int flags)
 {
-    return -EOPNOTSUPP;
+    unsigned long irq_flags;
+    struct hv_tlb_flush *flush = this_cpu(hv_input_page);
+    uint64_t ret;
+    unsigned int order = flags & FLUSH_ORDER_MASK;
+    unsigned int max_gvas;
+
+    ASSERT(flush);
+    ASSERT(!cpumask_empty(mask));
+
+    local_irq_save(irq_flags);
+
+    flush->address_space = 0;
+    flush->flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
+    flush->processor_mask = 0;
+    if ( !(flags & FLUSH_TLB_GLOBAL) )
+        flush->flags |= HV_FLUSH_NON_GLOBAL_MAPPINGS_ONLY;
+
+    if ( cpumask_equal(mask, &cpu_online_map) )
+        flush->flags |= HV_FLUSH_ALL_PROCESSORS;
+    else
+    {
+        int cpu;
+
+        /*
+         * Normally VP indices are in ascending order and match Xen's
+         * idea of CPU ids. Check the last index to see if VP index is
+         * >= 64. If so, we can skip setting up parameters for
+         * non-applicable hypercalls without looking further.
+         */
+        if ( hv_vp_index(cpumask_last(mask)) >= 64 )
+            goto do_ex_hypercall;
+
+        for_each_cpu ( cpu, mask )
+        {
+            uint32_t vpid = hv_vp_index(cpu);
+
+            if ( vpid > ms_hyperv.max_vp_index )
+            {
+                local_irq_restore(irq_flags);
+                return -ENXIO;
+            }
+
+            if ( vpid >= 64 )
+                goto do_ex_hypercall;
+
+            __set_bit(vpid, &flush->processor_mask);
+        }
+    }
+
+    max_gvas = (PAGE_SIZE - sizeof(*flush)) / sizeof(flush->gva_list[0]);
+
+    /*
+     * Flush the entire address space if va is NULL or if there is not
+     * enough space for gva_list.
+     */
+    if ( !va || (ORDER_TO_BYTES(order) / HV_TLB_FLUSH_UNIT) > max_gvas )
+        ret = hv_do_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE,
+                              virt_to_maddr(flush), 0);
+    else
+    {
+        unsigned int gvas = fill_gva_list(flush->gva_list, va, order);
+
+        ret = hv_do_rep_hypercall(HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST, gvas, 0,
+                                  virt_to_maddr(flush), 0);
+    }
+
+    goto done;
+
+ do_ex_hypercall:
+    ret = flush_tlb_ex(mask, va, flags);
+
+ done:
+    local_irq_restore(irq_flags);
+
+    return ret & HV_HYPERCALL_RESULT_MASK;
 }
 
 /*
diff --git a/xen/arch/x86/guest/hyperv/util.c b/xen/arch/x86/guest/hyperv/util.c
new file mode 100644
index 0000000000..9d0b5f4a46
--- /dev/null
+++ b/xen/arch/x86/guest/hyperv/util.c
@@ -0,0 +1,72 @@
+/******************************************************************************
+ * arch/x86/guest/hyperv/util.c
+ *
+ * Hyper-V utility functions
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (c) 2020 Microsoft.
+ */
+
+#include <xen/cpu.h>
+#include <xen/cpumask.h>
+#include <xen/errno.h>
+
+#include <asm/guest/hyperv.h>
+#include <asm/guest/hyperv-tlfs.h>
+
+#include "private.h"
+
+int cpumask_to_vpset(struct hv_vpset *vpset,
+                     const cpumask_t *mask)
+{
+    int nr = 1, cpu, vcpu_bank, vcpu_offset;
+    int max_banks = ms_hyperv.max_vp_index / 64;
+
+    /* Up to 64 banks can be represented by valid_bank_mask */
+    if ( max_banks >= 64 )
+        return -1;
+
+    /* Clear all banks to avoid flushing unwanted CPUs */
+    for ( vcpu_bank = 0; vcpu_bank <= max_banks; vcpu_bank++ )
+        vpset->bank_contents[vcpu_bank] = 0;
+
+    vpset->valid_bank_mask = 0;
+
+    for_each_cpu ( cpu, mask )
+    {
+        int vcpu = hv_vp_index(cpu);
+
+        vcpu_bank = vcpu / 64;
+        vcpu_offset = vcpu % 64;
+
+        __set_bit(vcpu_offset, &vpset->bank_contents[vcpu_bank]);
+        __set_bit(vcpu_bank, &vpset->valid_bank_mask);
+
+        if ( vcpu_bank >= nr )
+            nr = vcpu_bank + 1;
+    }
+
+    return nr;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
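
A closing note on the HV_GENERIC_SET_SPARSE_4K format that
flush_tlb_ex() relies on: VP indices are grouped into 64-bit banks,
valid_bank_mask records which banks are present, and bank_contents[]
carries one bitmap per bank; this is what cpumask_to_vpset() builds.
The standalone sketch below reproduces that banking arithmetic outside
Xen for illustration; struct demo_vpset and vp_index() are simplified
stand-ins for the TLFS structures and the real CPU-to-VP mapping, not
part of the patch.

/* Illustrative only -- the banking arithmetic behind cpumask_to_vpset(). */
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for struct hv_vpset (not the TLFS layout) */
struct demo_vpset {
    uint64_t valid_bank_mask;
    uint64_t bank_contents[64];
};

/* Hypothetical CPU -> VP index mapping; identity for the demo */
static unsigned int vp_index(unsigned int cpu)
{
    return cpu;
}

static int vpset_from_cpus(struct demo_vpset *vpset,
                           const unsigned int *cpus, unsigned int n_cpus,
                           unsigned int max_vp_index)
{
    unsigned int max_banks = max_vp_index / 64, i, bank;
    int nr = 1;

    if ( max_banks >= 64 ) /* valid_bank_mask can name at most 64 banks */
        return -1;

    vpset->valid_bank_mask = 0;
    for ( bank = 0; bank <= max_banks; bank++ )
        vpset->bank_contents[bank] = 0;

    for ( i = 0; i < n_cpus; i++ )
    {
        unsigned int vp = vp_index(cpus[i]);

        vpset->bank_contents[vp / 64] |= 1ULL << (vp % 64);
        vpset->valid_bank_mask |= 1ULL << (vp / 64);

        if ( (int)(vp / 64) >= nr )
            nr = vp / 64 + 1;
    }

    return nr; /* number of banks, the hypercall's variable header size */
}

int main(void)
{
    struct demo_vpset set;
    unsigned int cpus[] = { 1, 70, 130 }; /* VPs land in banks 0, 1 and 2 */
    int nr = vpset_from_cpus(&set, cpus, 3, 255);

    printf("banks: %d, valid_bank_mask: %#lx\n", nr,
           (unsigned long)set.valid_bank_mask);
    return 0;
}

With VPs 1, 70 and 130 the demo reports 3 banks and a valid_bank_mask
of 0x7, matching what the _EX hypercalls would receive as their
variable-sized header.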