From patchwork Mon Jul 1 11:20:33 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11025637
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné
Date: Mon, 1 Jul 2019 11:20:33 +0000
Message-ID: <4365e23d-c2aa-dc10-46d0-df38d9c36322@suse.com>
Subject: [Xen-devel] [PATCH v9 08/23] x86emul: support AVX512PF insns
List-Id: Xen developer discussion

Some adjustments are necessary to the EVEX Disp8 scaling test code to
account for the zero byte reads/writes, which get issued for the test
harness only.

Signed-off-by: Jan Beulich
Acked-by: Andrew Cooper
---
v9: Suppress general register update upon failures. Re-base.
v8: #GP/#SS don't arise here. Add previously missed change to
    emul_test_init().
v7: Re-base.
v6: New.
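
Note on the zero byte accesses: the harness read()/write() hooks pass
"bytes + !bytes" to record_access(), so that a zero byte access still
marks one byte of the tracking array. A minimal standalone sketch of that
idiom follows; recorded_len() is a hypothetical helper name used only for
illustration and is not part of the patch.

```c
/*
 * Hypothetical helper (not in the patch): length handed to the access
 * recorder.  !bytes evaluates to 1 only when bytes is 0, so a zero byte
 * access is recorded as a single byte while every other length is left
 * unchanged.
 */
unsigned int recorded_len(unsigned int bytes)
{
    return bytes + !bytes;
}
```

With this, recorded_len(0) yields 1 and recorded_len(64) yields 64, which
matches the "++n" case added to test_one() for the prefetch insns.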
--- a/tools/tests/x86_emulator/evex-disp8.c
+++ b/tools/tests/x86_emulator/evex-disp8.c
@@ -520,6 +520,17 @@ static const struct test avx512er_512[]
     INSN(rsqrt28, 66, 0f38, cd, el, sd, el),
 };
 
+static const struct test avx512pf_512[] = {
+    INSNX(gatherpf0d, 66, 0f38, c6, 1, vl, sd, el),
+    INSNX(gatherpf0q, 66, 0f38, c7, 1, vl, sd, el),
+    INSNX(gatherpf1d, 66, 0f38, c6, 2, vl, sd, el),
+    INSNX(gatherpf1q, 66, 0f38, c7, 2, vl, sd, el),
+    INSNX(scatterpf0d, 66, 0f38, c6, 5, vl, sd, el),
+    INSNX(scatterpf0q, 66, 0f38, c7, 5, vl, sd, el),
+    INSNX(scatterpf1d, 66, 0f38, c6, 6, vl, sd, el),
+    INSNX(scatterpf1q, 66, 0f38, c7, 6, vl, sd, el),
+};
+
 static const struct test avx512_vbmi_all[] = {
     INSN(permb, 66, 0f38, 8d, vl, b, vl),
     INSN(permi2b, 66, 0f38, 75, vl, b, vl),
@@ -580,7 +591,7 @@ static bool record_access(enum x86_segme
 static int read(enum x86_segment seg, unsigned long offset, void *p_data,
                 unsigned int bytes, struct x86_emulate_ctxt *ctxt)
 {
-    if ( !record_access(seg, offset, bytes) )
+    if ( !record_access(seg, offset, bytes + !bytes) )
         return X86EMUL_UNHANDLEABLE;
     memset(p_data, 0, bytes);
     return X86EMUL_OKAY;
@@ -589,7 +600,7 @@ static int read(enum x86_segment seg, un
 static int write(enum x86_segment seg, unsigned long offset, void *p_data,
                  unsigned int bytes, struct x86_emulate_ctxt *ctxt)
 {
-    if ( !record_access(seg, offset, bytes) )
+    if ( !record_access(seg, offset, bytes + !bytes) )
         return X86EMUL_UNHANDLEABLE;
     return X86EMUL_OKAY;
 }
@@ -597,7 +608,7 @@ static int write(enum x86_segment seg, u
 static void test_one(const struct test *test, enum vl vl,
                      unsigned char *instr, struct x86_emulate_ctxt *ctxt)
 {
-    unsigned int vsz, esz, i;
+    unsigned int vsz, esz, i, n;
     int rc;
     bool sg = strstr(test->mnemonic, "gather") ||
               strstr(test->mnemonic, "scatter");
@@ -725,10 +736,20 @@ static void test_one(const struct test *
     for ( i = 0; i < (test->scale == SC_vl ? vsz : esz); ++i )
         if ( accessed[i] )
             goto fail;
-    for ( ; i < (test->scale == SC_vl ? vsz : esz) + (sg ? esz : vsz); ++i )
+
+    n = test->scale == SC_vl ? vsz : esz;
+    if ( !sg )
+        n += vsz;
+    else if ( !strstr(test->mnemonic, "pf") )
+        n += esz;
+    else
+        ++n;
+
+    for ( ; i < n; ++i )
         if ( accessed[i] != (sg ? (vsz / esz) >> (test->opc & 1 & !evex.w)
                                 : 1) )
             goto fail;
+
     for ( ; i < ARRAY_SIZE(accessed); ++i )
         if ( accessed[i] )
             goto fail;
@@ -887,6 +908,8 @@ void evex_disp8_test(void *instr, struct
     RUN(avx512dq, no128);
     RUN(avx512dq, 512);
     RUN(avx512er, 512);
+#define cpu_has_avx512pf cpu_has_avx512f
+    RUN(avx512pf, 512);
     RUN(avx512_vbmi, all);
     RUN(avx512_vbmi2, all);
 }
--- a/tools/tests/x86_emulator/x86-emulate.c
+++ b/tools/tests/x86_emulator/x86-emulate.c
@@ -73,6 +73,7 @@ bool emul_test_init(void)
      */
     cp.basic.movbe = true;
     cp.feat.adx = true;
+    cp.feat.avx512pf = cp.feat.avx512f;
     cp.feat.rdpid = true;
     cp.extd.clzero = true;
 
@@ -135,12 +136,14 @@ int emul_test_cpuid(
         res->c |= 1U << 22;
 
     /*
-     * The emulator doesn't itself use ADCX/ADOX/RDPID, so we can always run
-     * the respective tests.
+     * The emulator doesn't itself use ADCX/ADOX/RDPID nor the S/G prefetch
+     * insns, so we can always run the respective tests.
      */
     if ( leaf == 7 && subleaf == 0 )
     {
         res->b |= 1U << 19;
+        if ( res->b & (1U << 16) )
+            res->b |= 1U << 26;
         res->c |= 1U << 22;
     }
 
--- a/xen/arch/x86/x86_emulate/x86_emulate.c
+++ b/xen/arch/x86/x86_emulate/x86_emulate.c
@@ -525,6 +525,7 @@ static const struct ext0f38_table {
     [0xbd] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
     [0xbe] = { .simd_size = simd_packed_fp, .d8s = d8s_vl },
     [0xbf] = { .simd_size = simd_scalar_vexw, .d8s = d8s_dq },
+    [0xc6 ... 0xc7] = { .simd_size = simd_other, .vsib = 1, .d8s = d8s_dq },
     [0xc8] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_vl },
     [0xc9] = { .simd_size = simd_other },
     [0xca] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_vl },
@@ -1871,6 +1872,7 @@ in_protmode(
 #define vcpu_has_smap() (ctxt->cpuid->feat.smap)
 #define vcpu_has_clflushopt() (ctxt->cpuid->feat.clflushopt)
 #define vcpu_has_clwb() (ctxt->cpuid->feat.clwb)
+#define vcpu_has_avx512pf() (ctxt->cpuid->feat.avx512pf)
 #define vcpu_has_avx512er() (ctxt->cpuid->feat.avx512er)
 #define vcpu_has_sha() (ctxt->cpuid->feat.sha)
 #define vcpu_has_avx512bw() (ctxt->cpuid->feat.avx512bw)
@@ -9410,6 +9412,97 @@ x86_emulate(
 
         state->simd_size = simd_none;
         break;
+    }
+
+    case X86EMUL_OPC_EVEX_66(0x0f38, 0xc6):
+    case X86EMUL_OPC_EVEX_66(0x0f38, 0xc7):
+    {
+#ifndef __XEN__
+        typeof(evex) *pevex;
+        union {
+            int32_t dw[16];
+            int64_t qw[8];
+        } index;
+#endif
+
+        ASSERT(ea.type == OP_MEM);
+        generate_exception_if((!cpu_has_avx512f || !evex.opmsk || evex.brs ||
+                               evex.z || evex.reg != 0xf || evex.lr != 2),
+                              EXC_UD);
+
+        switch ( modrm_reg & 7 )
+        {
+        case 1: /* vgatherpf0{d,q}p{s,d} mem{k} */
+        case 2: /* vgatherpf1{d,q}p{s,d} mem{k} */
+        case 5: /* vscatterpf0{d,q}p{s,d} mem{k} */
+        case 6: /* vscatterpf1{d,q}p{s,d} mem{k} */
+            vcpu_must_have(avx512pf);
+            break;
+        default:
+            generate_exception(EXC_UD);
+        }
+
+        get_fpu(X86EMUL_FPU_zmm);
+
+#ifndef __XEN__
+        /*
+         * For the test harness perform zero byte memory accesses, such that
+         * in particular correct Disp8 scaling can be verified.
+         */
+        fail_if((modrm_reg & 4) && !ops->write);
+
+        /* Read index register. */
+        opc = init_evex(stub);
+        pevex = copy_EVEX(opc, evex);
+        pevex->opcx = vex_0f;
+        /* vmovdqu{32,64} */
+        opc[0] = 0x7f;
+        pevex->pfx = vex_f3;
+        pevex->w = b & 1;
+        /* Use (%rax) as destination and sib_index as source. */
+        pevex->b = 1;
+        opc[1] = (state->sib_index & 7) << 3;
+        pevex->r = !mode_64bit() || !(state->sib_index & 0x08);
+        pevex->R = !mode_64bit() || !(state->sib_index & 0x10);
+        pevex->RX = 1;
+        opc[2] = 0xc3;
+
+        invoke_stub("", "", "=m" (index) : "a" (&index));
+        put_stub(stub);
+
+        /* Clear untouched parts of the mask value. */
+        n = 1 << (4 - ((b & 1) | evex.w));
+        op_mask &= (1 << n) - 1;
+
+        for ( i = 0; rc == X86EMUL_OKAY && op_mask; ++i )
+        {
+            signed long idx = b & 1 ? index.qw[i] : index.dw[i];
+
+            if ( !(op_mask & (1 << i)) )
+                continue;
+
+            rc = (modrm_reg & 4
+                  ? ops->write
+                  : ops->read)(ea.mem.seg,
+                               truncate_ea(ea.mem.off +
+                                           (idx << state->sib_scale)),
+                               NULL, 0, ctxt);
+            if ( rc == X86EMUL_EXCEPTION )
+            {
+                /* Squash memory access related exceptions. */
+                x86_emul_reset_event(ctxt);
+                rc = X86EMUL_OKAY;
+            }
+
+            op_mask &= ~(1 << i);
+        }
+
+        if ( rc != X86EMUL_OKAY )
+            goto done;
+#endif
+
+        state->simd_size = simd_none;
+        break;
     }
 
     case X86EMUL_OPC(0x0f38, 0xc8): /* sha1nexte xmm/m128,xmm */
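
Note on the mask handling in the x86_emulate.c hunk: the element count is
derived from the low opcode bit (0xc6 uses dword indexes, 0xc7 qword
indexes) and EVEX.W, and the opmask is then trimmed to that many bits. The
following is only a standalone sketch of that arithmetic, assuming 512-bit
vector length as the patch enforces; pf_elem_count() and pf_trim_mask()
are hypothetical names for illustration, not emulator functions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of elements a 512-bit S/G prefetch insn
 * covers.  Only dword indexes combined with EVEX.W clear give 16
 * elements; a qword index or EVEX.W set halves that to 8.
 */
unsigned int pf_elem_count(unsigned int opc_low_bit, unsigned int evex_w)
{
    return 1u << (4 - ((opc_low_bit & 1) | (evex_w & 1)));
}

/* Hypothetical helper: keep only the low n bits of the opmask value. */
uint64_t pf_trim_mask(uint64_t op_mask, unsigned int n)
{
    return op_mask & ((UINT64_C(1) << n) - 1);
}
```

For example, pf_trim_mask(0xffff, pf_elem_count(1, 0)) leaves 0xff, i.e.
only the eight mask bits that correspond to actual qword indexes.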