From patchwork Mon Feb 27 21:05:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chang S. Bae" X-Patchwork-Id: 13154200 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5216EC64ED6 for ; Mon, 27 Feb 2023 21:16:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229703AbjB0VQr (ORCPT ); Mon, 27 Feb 2023 16:16:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53570 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229787AbjB0VQq (ORCPT ); Mon, 27 Feb 2023 16:16:46 -0500 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4317326CE9; Mon, 27 Feb 2023 13:16:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1677532605; x=1709068605; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=sclNcLaMWRVL+dtt0esI+nvc2/fJ8Evi5Feh00cpz/4=; b=f4D8URr9BRG6Eq97bUhHjkDF5EQcDwRevt10hB689w6CCwo5N+HjfGro wdVCvUaZ7RJCD792ncr/8xgL2yFx2RKxJ4o9YTcmvH4PyJd1ecubdL+82 cbJaPqkvXgHe8Fg7+PiEaO1tDOxp2/PUSNvI+lg2Ra3LrTW4hMn5v/xBU J4J0BDleUUQjtA94NT2z30umugBe7X/FonZSEmRnaK7k/T/6SHcp2PVTj Z6WruuVMt8KXgQZ4SQtiOvTwEjXgwVZXGbKRu9nvs9CpWPaQMC70Qt7dd eDPmmH17Btn1wkZ8AaIA1dQNqEIHkNzXH+G6mSI1WHRL0Vr9rHNK054vQ g==; X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="322216373" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="322216373" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Feb 2023 13:16:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="651372910" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="651372910" Received: from chang-linux-3.sc.intel.com ([172.25.66.173]) by orsmga006.jf.intel.com with ESMTP; 27 Feb 2023 13:16:42 -0800 From: "Chang S. Bae" To: linux-kernel@vger.kernel.org Cc: x86@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, dave.hansen@intel.com, hpa@zytor.com, linux-kselftest@vger.kernel.org, mizhang@google.com, chang.seok.bae@intel.com Subject: [PATCH 1/2] x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf() Date: Mon, 27 Feb 2023 13:05:03 -0800 Message-Id: <20230227210504.18520-2-chang.seok.bae@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230227210504.18520-1-chang.seok.bae@intel.com> References: <20221018221349.4196-1-chang.seok.bae@intel.com> <20230227210504.18520-1-chang.seok.bae@intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org __copy_xstate_to_uabi_buf() copies either from the tasks XSAVE buffer or from init_fpstate into the ptrace buffer. Dynamic features, like XTILEDATA, have an all zeroes init state and are not saved in init_fpstate, which means the corresponding bit is not set in the xfeatures bitmap of the init_fpstate header. But __copy_xstate_to_uabi_buf() retrieves addresses for both the tasks xstate and init_fpstate unconditionally via __raw_xsave_addr(). So if the tasks XSAVE buffer has a dynamic feature set, then the address retrieval for init_fpstate triggers the warning in __raw_xsave_addr() which checks the feature bit in the init_fpstate header. Remove the address retrieval from init_fpstate for extended features. They have an all zeroes init state so init_fpstate has zeros for them. Then zeroing the user buffer for the init state is the same as copying them from init_fpstate. Fixes: 2308ee57d93d ("x86/fpu/amx: Enable the AMX feature in 64-bit mode") Reported-by: Mingwei Zhang Signed-off-by: Chang S. Bae Tested-by: Mingwei Zhang Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Link: https://lore.kernel.org/kvm/20230221163655.920289-2-mizhang@google.com/ --- Thanks, Mingwei for detecting the issue and Thomas for explaining the problem with the well-written description. The background and problem sections in this changelog came from his write-up. I acknowledge that Mingwei's patch (in the link above) can solve the problem. But, I would rather propose this approach as it simplifies the code which is considered good in this sensitive copy function. This mostly zeroed init_fpstate is still useful, e.g., when copying the init state between buffers with the same format -- like in fpu_clone(). But, this case between different XSAVE formats looks a bit different than that. --- arch/x86/kernel/fpu/xstate.c | 30 ++++++++++++++---------------- 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index 714166cc25f2..0bab497c9436 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -1118,21 +1118,20 @@ void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate, zerofrom = offsetof(struct xregs_state, extended_state_area); /* - * The ptrace buffer is in non-compacted XSAVE format. In - * non-compacted format disabled features still occupy state space, - * but there is no state to copy from in the compacted - * init_fpstate. The gap tracking will zero these states. - */ - mask = fpstate->user_xfeatures; - - /* - * Dynamic features are not present in init_fpstate. When they are - * in an all zeros init state, remove those from 'mask' to zero - * those features in the user buffer instead of retrieving them - * from init_fpstate. + * This 'mask' indicates which states to copy from fpstate. + * Those extended states that are not present in fpstate are + * either disabled or initialized: + * + * In non-compacted format, disabled features still occupy + * state space but there is no state to copy from in the + * compacted init_fpstate. The gap tracking will zero these + * states. + * + * The extended features have an all zeroes init state. Thus, + * remove them from 'mask' to zero those features in the user + * buffer instead of retrieving them from init_fpstate. */ - if (fpu_state_size_dynamic()) - mask &= (header.xfeatures | xinit->header.xcomp_bv); + mask = header.xfeatures; for_each_extended_xfeature(i, mask) { /* @@ -1151,9 +1150,8 @@ void __copy_xstate_to_uabi_buf(struct membuf to, struct fpstate *fpstate, pkru.pkru = pkru_val; membuf_write(&to, &pkru, sizeof(pkru)); } else { - copy_feature(header.xfeatures & BIT_ULL(i), &to, + membuf_write(&to, __raw_xsave_addr(xsave, i), - __raw_xsave_addr(xinit, i), xstate_sizes[i]); } /* From patchwork Mon Feb 27 21:05:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Chang S. Bae" X-Patchwork-Id: 13154201 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEF7DC64ED9 for ; Mon, 27 Feb 2023 21:16:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229515AbjB0VQs (ORCPT ); Mon, 27 Feb 2023 16:16:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53564 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229803AbjB0VQq (ORCPT ); Mon, 27 Feb 2023 16:16:46 -0500 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A190826CD8; Mon, 27 Feb 2023 13:16:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1677532605; x=1709068605; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=ELF9ow2F2OB1PVfHYSLDbv3bTw6H8vE2SYqZsSwjkOI=; b=Av7DlEKWHs9UKsugYPSjxZDflSYd9qlDx5h/hxLKXiO7XbwWT8Bd9YSf Uyehc3+dpFBlx7drCVG1/raygCDOV4SybGpFToWFpPOkCxq5G0DwBdLWM rkqh5SkeUTG4NIEhk8dVTMw1YcXANMlkKVHR4wZdSSo6z1tAOj7uaBDgs 2yEFIgZtliQd4AXImCmjoTdYN+X+W+RIyGiUAWEREtrmeoQqaWhRAEOPP u7sPAv/vxBcc8HQX1v6ZR8S8K/1g7TtKfDUhaQF02OM3lolu2/sw4kJi3 QcTd/hu3zck4Nrtsoy96xG9YuKD8/QckY7I46QBOzevTisGTxkYRPjiB5 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="322216377" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="322216377" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Feb 2023 13:16:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10634"; a="651372913" X-IronPort-AV: E=Sophos;i="5.98,220,1673942400"; d="scan'208";a="651372913" Received: from chang-linux-3.sc.intel.com ([172.25.66.173]) by orsmga006.jf.intel.com with ESMTP; 27 Feb 2023 13:16:42 -0800 From: "Chang S. Bae" To: linux-kernel@vger.kernel.org Cc: x86@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, dave.hansen@intel.com, hpa@zytor.com, linux-kselftest@vger.kernel.org, mizhang@google.com, chang.seok.bae@intel.com Subject: [PATCH 2/2] selftests/x86/amx: Add a ptrace test Date: Mon, 27 Feb 2023 13:05:04 -0800 Message-Id: <20230227210504.18520-3-chang.seok.bae@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230227210504.18520-1-chang.seok.bae@intel.com> References: <20221018221349.4196-1-chang.seok.bae@intel.com> <20230227210504.18520-1-chang.seok.bae@intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Include a test case to validate the XTILEDATA injection to the target. Also, it ensures the kernel's ability to copy states between different XSAVE formats. Refactor the memcmp() code to be usable for the state validation. Signed-off-by: Chang S. Bae Cc: x86@kernel.org Cc: linux-kselftest@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- tools/testing/selftests/x86/amx.c | 108 +++++++++++++++++++++++++++++- 1 file changed, 105 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c index 625e42901237..d884fd69dd51 100644 --- a/tools/testing/selftests/x86/amx.c +++ b/tools/testing/selftests/x86/amx.c @@ -14,8 +14,10 @@ #include #include #include +#include #include #include +#include #include "../kselftest.h" /* For __cpuid_count() */ @@ -583,6 +585,13 @@ static void test_dynamic_state(void) _exit(0); } +static inline int __compare_tiledata_state(struct xsave_buffer *xbuf1, struct xsave_buffer *xbuf2) +{ + return memcmp(&xbuf1->bytes[xtiledata.xbuf_offset], + &xbuf2->bytes[xtiledata.xbuf_offset], + xtiledata.size); +} + /* * Save current register state and compare it to @xbuf1.' * @@ -599,9 +608,7 @@ static inline bool __validate_tiledata_regs(struct xsave_buffer *xbuf1) fatal_error("failed to allocate XSAVE buffer\n"); xsave(xbuf2, XFEATURE_MASK_XTILEDATA); - ret = memcmp(&xbuf1->bytes[xtiledata.xbuf_offset], - &xbuf2->bytes[xtiledata.xbuf_offset], - xtiledata.size); + ret = __compare_tiledata_state(xbuf1, xbuf2); free(xbuf2); @@ -826,6 +833,99 @@ static void test_context_switch(void) free(finfo); } +/* Ptrace test */ + +/* + * Make sure the ptracee has the expanded kernel buffer on the first + * use. Then, initialize the state before performing the state + * injection from the ptracer. + */ +static inline void ptracee_firstuse_tiledata(void) +{ + load_rand_tiledata(stashed_xsave); + init_xtiledata(); +} + +/* + * Ptracer injects the randomized tile data state. It also reads + * before and after that, which will execute the kernel's state copy + * functions. So, the tester is advised to double-check any emitted + * kernel messages. + */ +static void ptracer_inject_tiledata(pid_t target) +{ + struct xsave_buffer *xbuf; + struct iovec iov; + + xbuf = alloc_xbuf(); + if (!xbuf) + fatal_error("unable to allocate XSAVE buffer"); + + printf("\tRead the init'ed tiledata via ptrace().\n"); + + iov.iov_base = xbuf; + iov.iov_len = xbuf_size; + + memset(stashed_xsave, 0, xbuf_size); + + if (ptrace(PTRACE_GETREGSET, target, (uint32_t)NT_X86_XSTATE, &iov)) + fatal_error("PTRACE_GETREGSET"); + + if (!__compare_tiledata_state(stashed_xsave, xbuf)) + printf("[OK]\tThe init'ed tiledata was read from ptracee.\n"); + else + printf("[FAIL]\tThe init'ed tiledata was not read from ptracee.\n"); + + printf("\tInject tiledata via ptrace().\n"); + + load_rand_tiledata(xbuf); + + memcpy(&stashed_xsave->bytes[xtiledata.xbuf_offset], + &xbuf->bytes[xtiledata.xbuf_offset], + xtiledata.size); + + if (ptrace(PTRACE_SETREGSET, target, (uint32_t)NT_X86_XSTATE, &iov)) + fatal_error("PTRACE_SETREGSET"); + + if (ptrace(PTRACE_GETREGSET, target, (uint32_t)NT_X86_XSTATE, &iov)) + fatal_error("PTRACE_GETREGSET"); + + if (!__compare_tiledata_state(stashed_xsave, xbuf)) + printf("[OK]\tTiledata was correctly written to ptracee.\n"); + else + printf("[FAIL]\tTiledata was not correctly written to ptracee.\n"); +} + +static void test_ptrace(void) +{ + pid_t child; + int status; + + child = fork(); + if (child < 0) { + err(1, "fork"); + } else if (!child) { + if (ptrace(PTRACE_TRACEME, 0, NULL, NULL)) + err(1, "PTRACE_TRACEME"); + + ptracee_firstuse_tiledata(); + + raise(SIGTRAP); + _exit(0); + } + + do { + wait(&status); + } while (WSTOPSIG(status) != SIGTRAP); + + ptracer_inject_tiledata(child); + + ptrace(PTRACE_DETACH, child, NULL, NULL); + wait(&status); + if (!WIFEXITED(status) || WEXITSTATUS(status)) + err(1, "ptrace test"); +} + int main(void) { /* Check hardware availability at first */ @@ -846,6 +946,8 @@ int main(void) ctxtswtest_config.num_threads = 5; test_context_switch(); + test_ptrace(); + clearhandler(SIGILL); free_stashed_xsave();