From patchwork Mon Nov 25 14:27:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13885018
List-Id: Xen developer discussion
Message-ID: <66aa8b0c-c811-483b-839e-49ca817a4672@suse.com>
Date: Mon, 25 Nov 2024 15:27:43 +0100
User-Agent: Mozilla Thunderbird
Subject: [PATCH v3 1/7] x86: suppress ERMS for internal use when MISC_ENABLE.FAST_STRING is clear
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné
Content-Language: en-US
Autocrypt: addr=jbeulich@suse.com; keydata=
 xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk
 hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK
 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD
 /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py
 O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl
 MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP
 nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo
 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp
 Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC
 AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee
 e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF
 hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l
 IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS
 FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj
 t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8
 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3
 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9
 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V
 m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM
 EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr
 wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A
 nAuWpQkjM1ASeQwSHEeAWPgskBQL

Before we start actually adjusting behavior when ERMS is available,
follow Linux commit 161ec53c702c ("x86, mem, intel: Initialize Enhanced
REP MOVSB/STOSB") and zap the CPUID-derived feature flag when the MSR
bit is clear.

Don't extend the artificial clearing to guest view, though: Guests can
take their own decision in this regard, as they can read (most of)
MISC_ENABLE.

Signed-off-by: Jan Beulich
---
TBD: Would be nice if "cpuid=no-erms" propagated to guest view (for
     "cpuid=" generally meaning to affect guests as well as Xen), but
     since both disabling paths use setup_clear_cpu_cap() they're
     indistinguishable in guest_common_feature_adjustments(). A separate
     boolean could take care of this, but would look clumsy to me.
---
v3: New.
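
The split between Xen's own view and the guest view described above can be
sketched in plain C. This is only an illustration under stated assumptions:
host_uses_erms() and guest_sees_erms() are hypothetical names that do not
exist in Xen, and the MSR value is passed in as a parameter instead of being
read with rdmsrl().

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of IA32_MISC_ENABLE, matching the constant this patch adds. */
#define MISC_ENABLE_FAST_STRING (UINT64_C(1) << 0)

/*
 * Hypothetical helper: ERMS as Xen itself would use it.  The CPUID bit
 * only counts when fast strings are enabled in MISC_ENABLE.
 */
static bool host_uses_erms(bool cpuid_erms, uint64_t misc_enable)
{
    return cpuid_erms && (misc_enable & MISC_ENABLE_FAST_STRING);
}

/*
 * Hypothetical helper: ERMS as offered to guests.  It follows the raw
 * CPUID bit only, since guests can inspect MISC_ENABLE themselves.
 */
static bool guest_sees_erms(bool cpuid_erms)
{
    return cpuid_erms;
}
```

With the MSR bit clear, host_uses_erms() reports false while
guest_sees_erms() still reflects the raw CPUID bit, which is the asymmetry
the commit message asks for.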
--- a/xen/arch/x86/cpu/intel.c
+++ b/xen/arch/x86/cpu/intel.c
@@ -337,8 +337,18 @@ static void cf_check early_init_intel(st
 		paddr_bits = 36;
 
 	if (c == &boot_cpu_data) {
+		uint64_t misc_enable;
+
 		check_memory_type_self_snoop_errata();
 
+		/*
+		 * If fast string is not enabled in IA32_MISC_ENABLE for any reason,
+		 * clear the enhanced fast string CPU capability.
+		 */
+		rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
+		if (!(misc_enable & MSR_IA32_MISC_ENABLE_FAST_STRING))
+			setup_clear_cpu_cap(X86_FEATURE_ERMS);
+
 		intel_init_levelling();
 	}
 
--- a/xen/arch/x86/cpu-policy.c
+++ b/xen/arch/x86/cpu-policy.c
@@ -590,6 +590,15 @@ static void __init guest_common_feature_
      */
     if ( host_cpu_policy.feat.ibrsb )
         __set_bit(X86_FEATURE_IBPB, fs);
+
+    /*
+     * We expose MISC_ENABLE to guests, so our internal clearing of ERMS when
+     * FAST_STRING is not set should not propagate to guest view. Guests can
+     * judge on their own whether to ignore the CPUID bit when the MSR bit is
+     * clear.
+     */
+    if ( raw_cpu_policy.feat.erms )
+        __set_bit(X86_FEATURE_ERMS, fs);
 }
 
 static void __init calculate_pv_max_policy(void)
--- a/xen/arch/x86/include/asm/msr-index.h
+++ b/xen/arch/x86/include/asm/msr-index.h
@@ -489,6 +489,7 @@
 #define MSR_IA32_THERM_INTERRUPT	0x0000019b
 #define MSR_IA32_THERM_STATUS		0x0000019c
 #define MSR_IA32_MISC_ENABLE		0x000001a0
+#define MSR_IA32_MISC_ENABLE_FAST_STRING  (1<<0)
 #define MSR_IA32_MISC_ENABLE_PERF_AVAIL   (1<<7)
 #define MSR_IA32_MISC_ENABLE_BTS_UNAVAIL  (1<<11)
 #define MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL (1<<12)

From patchwork Mon Nov 25 14:28:03 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13885019
Message-ID: <62b3403f-3800-4c1e-a7a2-165ebfac04c0@suse.com>
Date: Mon, 25 Nov 2024 15:28:03 +0100
Subject: [PATCH v3 2/7] x86: re-work memset()
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Move the function to its own assembly file. Having it in C just for the
entire body to be an asm() isn't really helpful. Then have two flavors:
A "basic" version using qword steps for the bulk of the operation, and an
ERMS version for modern hardware, to be substituted in via alternatives
patching.

Signed-off-by: Jan Beulich
---
We may want to consider branching over the REP STOSQ as well, if the
number of qwords turns out to be zero.
We may also want to consider using non-REP STOS{L,W,B} for the tail.
---
v3: Re-base.
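
As a rough C rendering of what the "basic" flavor does (a sketch for
illustration, not part of the patch): the fill byte is replicated into all
eight bytes of a qword by multiplying with 0x0101010101010101, the bulk is
stored in qword steps, and the remaining 0 to 7 bytes are stored one by one.
The name memset_sketch() is hypothetical, and the C library's memcpy() is
used only as a portable way to express an unaligned 8-byte store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the qword-stepping memset flavor. */
static void *memset_sketch(void *s, int c, size_t n)
{
    unsigned char *p = s;
    /* Replicate the low byte of c into every byte of a qword (the IMUL). */
    uint64_t pattern = (uint64_t)(unsigned char)c *
                       UINT64_C(0x0101010101010101);
    size_t qwords = n >> 3; /* corresponds to: shr $3, %rcx */
    size_t tail = n & 7;    /* corresponds to: and $7, %edx */

    /* Bulk part: one 8-byte store per step, standing in for REP STOSQ. */
    for ( ; qwords; --qwords, p += sizeof(pattern) )
        memcpy(p, &pattern, sizeof(pattern));

    /* Tail part: at most 7 single-byte stores, standing in for REP STOSB. */
    for ( ; tail; --tail )
        *p++ = (unsigned char)c;

    /* Like memset(), hand back the original destination pointer. */
    return s;
}
```

The ERMS flavor needs none of this splitting: it issues a single REP STOSB
over all n bytes and leaves the chunking to the hardware.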
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -48,6 +48,7 @@ obj-$(CONFIG_INDIRECT_THUNK) += indirect
 obj-$(CONFIG_PV) += ioport_emulate.o
 obj-y += irq.o
 obj-$(CONFIG_KEXEC) += machine_kexec.o
+obj-y += memset.o
 obj-y += mm.o x86_64/mm.o
 obj-$(CONFIG_HVM) += monitor.o
 obj-y += mpparse.o
--- /dev/null
+++ b/xen/arch/x86/memset.S
@@ -0,0 +1,30 @@
+#include
+
+.macro memset
+        and $7, %edx
+        shr $3, %rcx
+        movzbl %sil, %esi
+        mov $0x0101010101010101, %rax
+        imul %rsi, %rax
+        mov %rdi, %rsi
+        rep stosq
+        or %edx, %ecx
+        jz 0f
+        rep stosb
+0:
+        mov %rsi, %rax
+        ret
+.endm
+
+.macro memset_erms
+        mov %esi, %eax
+        mov %rdi, %rsi
+        rep stosb
+        mov %rsi, %rax
+        ret
+.endm
+
+FUNC(memset)
+        mov %rdx, %rcx
+        ALTERNATIVE memset, memset_erms, X86_FEATURE_ERMS
+END(memset)
--- a/xen/arch/x86/string.c
+++ b/xen/arch/x86/string.c
@@ -22,19 +22,6 @@ void *(memcpy)(void *dest, const void *s
     return dest;
 }
 
-void *(memset)(void *s, int c, size_t n)
-{
-    long d0, d1;
-
-    asm volatile (
-        "rep stosb"
-        : "=&c" (d0), "=&D" (d1)
-        : "a" (c), "1" (s), "0" (n)
-        : "memory");
-
-    return s;
-}
-
 void *(memmove)(void *dest, const void *src, size_t n)
 {
     long d0, d1, d2;

From patchwork Mon Nov 25 14:28:32 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13885020
Date: Mon, 25 Nov 2024 15:28:32 +0100
Subject: [PATCH v3 3/7] x86: re-work memcpy()
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Move the function to its own assembly file. Having it in C just for the
entire body to be an asm() isn't really helpful. Then have two flavors:
A "basic" version using qword steps for the bulk of the operation, and an
ERMS version for modern hardware, to be substituted in via alternatives
patching.

Alternatives patching, however, requires an extra precaution: It uses
memcpy() itself, and hence the function may patch itself. Luckily the
patched-in code only replaces the prolog of the original function. Make
sure it stays this way.

Additionally alternatives patching, while supposedly safe via enforcing
a control flow change when modifying already prefetched code, may not
really be. Afaict a request is pending to drop the first of the two
options in the SDM's "Handling Self- and Cross-Modifying Code" section.
Insert a serializing instruction there.

Signed-off-by: Jan Beulich
---
We may want to consider branching over the REP MOVSQ as well, if the
number of qwords turns out to be zero.
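
For illustration only, the qword/tail split of the "basic" flavor and the
zero-qword shortcut mentioned in the note above can be modelled in C as
below. The name memcpy_sketch() is hypothetical, plain byte loops stand in
for the REP MOVSQ and REP MOVSB steps, and the explicit check on the qword
count is the optional branch from the note, not something the patch itself
does.

```c
#include <stddef.h>

/* Hypothetical C model of the qword-stepping memcpy flavor. */
static void *memcpy_sketch(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    size_t qwords = n >> 3; /* corresponds to: shr $3, %rcx */
    size_t tail = n & 7;    /* corresponds to: and $7, %edx */

    /* Optional shortcut from the note: skip the bulk copy if it is empty. */
    if ( qwords )
    {
        size_t bytes = qwords << 3;
        size_t i;

        /* Stands in for REP MOVSQ: copies whole 8-byte groups only. */
        for ( i = 0; i < bytes; i++ )
            d[i] = s[i];
        d += bytes;
        s += bytes;
    }

    /* Stands in for the trailing REP MOVSB: at most 7 bytes. */
    while ( tail-- )
        *d++ = *s++;

    /* Like memcpy(), hand back the original destination pointer. */
    return dest;
}
```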
We may also want to consider using non-REP MOVS{L,W,B} for the tail.

TBD: We may further need a workaround similar to Linux's 8ca97812c3c8
     ("x86/mce: Work around an erratum on fast string copy
     instructions").
---
v3: Re-base.

--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -48,6 +48,7 @@ obj-$(CONFIG_INDIRECT_THUNK) += indirect
 obj-$(CONFIG_PV) += ioport_emulate.o
 obj-y += irq.o
 obj-$(CONFIG_KEXEC) += machine_kexec.o
+obj-y += memcpy.o
 obj-y += memset.o
 obj-y += mm.o x86_64/mm.o
 obj-$(CONFIG_HVM) += monitor.o
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -153,12 +153,14 @@ void init_or_livepatch add_nops(void *in
  * executing.
  *
  * "noinline" to cause control flow change and thus invalidate I$ and
- * cause refetch after modification.
+ * cause refetch after modification. While the SDM continues to suggest this
+ * is sufficient, it may not be - issue a serializing insn afterwards as well.
  */
 static void init_or_livepatch noinline
 text_poke(void *addr, const void *opcode, size_t len)
 {
     memcpy(addr, opcode, len);
+    cpuid_eax(0);
 }
 
 extern void *const __initdata_cf_clobber_start[];
--- /dev/null
+++ b/xen/arch/x86/memcpy.S
@@ -0,0 +1,20 @@
+#include
+
+FUNC(memcpy)
+        mov %rdx, %rcx
+        mov %rdi, %rax
+        /*
+         * We need to be careful here: memcpy() is involved in alternatives
+         * patching, so the code doing the actual copying (i.e. past setting
+         * up registers) may not be subject to patching (unless further
+         * precautions were taken).
+         */
+        ALTERNATIVE "and $7, %edx; shr $3, %rcx", \
+                    "rep movsb; ret", X86_FEATURE_ERMS
+        rep movsq
+        or %edx, %ecx
+        jz 1f
+        rep movsb
+1:
+        ret
+END(memcpy)
--- a/xen/arch/x86/string.c
+++ b/xen/arch/x86/string.c
@@ -7,21 +7,6 @@
 
 #include
 
-void *(memcpy)(void *dest, const void *src, size_t n)
-{
-    long d0, d1, d2;
-
-    asm volatile (
-        " rep ; movs"__OS" ; "
-        " mov %k4,%k3 ; "
-        " rep ; movsb "
-        : "=&c" (d0), "=&D" (d1), "=&S" (d2)
-        : "0" (n/BYTES_PER_LONG), "r" (n%BYTES_PER_LONG), "1" (dest), "2" (src)
-        : "memory" );
-
-    return dest;
-}
-
 void *(memmove)(void *dest, const void *src, size_t n)
 {
     long d0, d1, d2;

From patchwork Mon Nov 25 14:29:00 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13885037
Message-ID: <1c935aba-a185-43de-9806-6781b1a7fcf9@suse.com>
Date: Mon, 25 Nov 2024 15:29:00 +0100
Subject: [PATCH v3 4/7] x86: control memset() and memcpy() inlining
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné

Stop the compiler from inlining non-trivial memset() and memcpy() (for
memset() examples see e.g. map_vcpu_info() or kimage_load_segments()).
This way we even keep the compiler from using REP STOSQ / REP MOVSQ when
we'd prefer REP STOSB / REP MOVSB (when ERMS is available).

With gcc10 this yields a modest .text size reduction (release build) of
around 2k.
Unfortunately these options aren't understood by the clang versions I
have readily available for testing with; I'm unaware of equivalents.

Note also that using cc-option-add is not an option here, or at least I
couldn't make things work with it (in case the option was not supported
by the compiler): The embedded comma in the option looks to be getting
in the way.

Requested-by: Andrew Cooper
Signed-off-by: Jan Beulich
---
v3: Re-base.
v2: New.
---
The boundary values are of course up for discussion - I wasn't really
certain whether to use 16 or 32; I'd be less certain about using yet
larger values.

Similarly whether to permit the compiler to emit REP STOSQ / REP MOVSQ
for known size, properly aligned blocks is up for discussion.

--- a/xen/arch/x86/arch.mk
+++ b/xen/arch/x86/arch.mk
@@ -65,6 +65,9 @@ endif
 $(call cc-option-add,CFLAGS_stack_boundary,CC,-mpreferred-stack-boundary=3)
 export CFLAGS_stack_boundary
 
+CFLAGS += $(call cc-option,$(CC),-mmemcpy-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+CFLAGS += $(call cc-option,$(CC),-mmemset-strategy=unrolled_loop:16:noalign$(comma)libcall:-1:noalign)
+
 ifeq ($(CONFIG_UBSAN),y)
 # Don't enable alignment sanitisation. x86 has efficient unaligned accesses,
 # and various things (ACPI tables, hypercall pages, stubs, etc) are wont-fix.

From patchwork Mon Nov 25 14:29:37 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13885038
List-Id: Xen developer discussion
Message-ID: <250d879a-2e2e-4283-b943-0ae4835c55f0@suse.com>
Date: Mon, 25 Nov 2024 15:29:37 +0100
Subject: [PATCH v3 5/7] x86: introduce "hot" and "cold" page clearing functions
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Roger Pau Monné

The present clear_page_sse2() is useful in case a page isn't going to
get touched again soon, or if we want to limit churn on the caches.
Amend it by alternatively using CLZERO, which has been found to be quite
a bit faster on Zen2 hardware at least. Note that to use CLZERO, we need
to know the cache line size, and hence a feature dependency on CLFLUSH
gets introduced.

For cases where latency is the most important aspect, or when it is
expected that sufficiently large parts of a page will get accessed again
soon after the clearing, introduce a "hot" alternative. Again use
alternatives patching to select between a "legacy" and an ERMS variant.

Don't switch any callers just yet - this will be the subject of
subsequent changes.

Signed-off-by: Jan Beulich
---
v3: Re-base.
v2: New.
---
Note: Ankur indicates that for ~L3-size or larger regions MOVNT/CLZERO
      is better even latency-wise.

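For illustration only (not part of the patch): a minimal C sketch of how the two values patched into the CLZERO loop could be derived from CPUID leaf 1 EBX, mirroring the early_cpu_init() change in the diff below. The struct and function names are hypothetical, PAGE_SIZE is assumed to be 4096, and the remark that the 128-byte cap matches the range of a signed 8-bit immediate is an inference, not something the patch states.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB pages, as on x86 */

/* Hypothetical container for the two values patched into the CLZERO loop. */
struct clzero_params {
    uint32_t count;    /* loop iterations: PAGE_SIZE / line size */
    int8_t   neg_size; /* immediate used by "sub $-size, %rax" */
};

/*
 * Derive the parameters from CPUID leaf 1 EBX, where bits 15:8 hold the
 * CLFLUSH line size in units of 8 bytes. Returns false if the size is
 * unusable (zero or not a power of two); the patch then clears the CLZERO
 * feature instead of patching the loop.
 */
static bool clzero_params_from_ebx(uint32_t ebx, struct clzero_params *p)
{
    unsigned int size = ((ebx >> 8) & 0xff) * 8;

    /* Cap as in the patch; -128 still fits a signed byte (inference). */
    if (size > 128)
        size = 128;
    if (!size || (size & (size - 1)))
        return false;

    p->count = PAGE_SIZE / size;
    p->neg_size = (int8_t)-(int)size;
    return true;
}
```

With EBX[15:8] = 8 (a 64-byte line) this gives 64 iterations and an immediate of -64, i.e. the defaults already assembled into clear_page_clzero.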
--- a/xen/arch/x86/clear_page.S
+++ b/xen/arch/x86/clear_page.S
@@ -1,9 +1,9 @@
         .file __FILE__
 
-#include
-#include
+#include
+#include
 
-FUNC(clear_page_sse2)
+        .macro clear_page_sse2
         mov     $PAGE_SIZE/32, %ecx
         xor     %eax,%eax
 
@@ -17,4 +17,43 @@ FUNC(clear_page_sse2)
 
         sfence
         ret
-END(clear_page_sse2)
+        .endm
+
+        .macro clear_page_clzero
+        mov     %rdi, %rax
+        mov     $PAGE_SIZE/64, %ecx
+        .globl clear_page_clzero_post_count
+clear_page_clzero_post_count:
+
+0:      clzero
+        sub     $-64, %rax
+        .globl clear_page_clzero_post_neg_size
+clear_page_clzero_post_neg_size:
+        sub     $1, %ecx
+        jnz     0b
+
+        sfence
+        ret
+        .endm
+
+FUNC(clear_page_cold)
+        ALTERNATIVE clear_page_sse2, clear_page_clzero, X86_FEATURE_CLZERO
+END(clear_page_cold)
+
+        .macro clear_page_stosb
+        mov     $PAGE_SIZE, %ecx
+        xor     %eax,%eax
+        rep stosb
+        ret
+        .endm
+
+        .macro clear_page_stosq
+        mov     $PAGE_SIZE/8, %ecx
+        xor     %eax, %eax
+        rep stosq
+        ret
+        .endm
+
+FUNC(clear_page_hot)
+        ALTERNATIVE clear_page_stosq, clear_page_stosb, X86_FEATURE_ERMS
+END(clear_page_hot)
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -58,6 +58,9 @@ DEFINE_PER_CPU(bool, full_gdt_loaded);
 
 DEFINE_PER_CPU(uint32_t, pkrs);
 
+extern uint32_t clear_page_clzero_post_count[];
+extern int8_t clear_page_clzero_post_neg_size[];
+
 void __init setup_clear_cpu_cap(unsigned int cap)
 {
 	const uint32_t *dfs;
@@ -355,8 +358,38 @@ void __init early_cpu_init(bool verbose)
 
 	edx &= ~cleared_caps[FEATURESET_1d];
 	ecx &= ~cleared_caps[FEATURESET_1c];
-	if (edx & cpufeat_mask(X86_FEATURE_CLFLUSH))
-		c->x86_cache_alignment = ((ebx >> 8) & 0xff) * 8;
+	if (edx & cpufeat_mask(X86_FEATURE_CLFLUSH)) {
+		unsigned int size = ((ebx >> 8) & 0xff) * 8;
+
+		c->x86_cache_alignment = size;
+
+		/*
+		 * Patch in parameters of clear_page_cold()'s CLZERO
+		 * alternative. Note that for now we cap this at 128 bytes.
+		 * Larger cache line sizes would still be dealt with
+		 * correctly, but would cause redundant work done.
+		 */
+		if (size > 128)
+			size = 128;
+		if (size && !(size & (size - 1))) {
+			/*
+			 * Need to play some games to keep the compiler from
+			 * recognizing the negative array index as being out
+			 * of bounds. The labels in assembler code really are
+			 * _after_ the locations to be patched, so the
+			 * negative index is intentional.
+			 */
+			uint32_t *pcount = clear_page_clzero_post_count;
+			int8_t *neg_size = clear_page_clzero_post_neg_size;
+
+			OPTIMIZER_HIDE_VAR(pcount);
+			OPTIMIZER_HIDE_VAR(neg_size);
+			pcount[-1] = PAGE_SIZE / size;
+			neg_size[-1] = -size;
+		}
+		else
+			setup_clear_cpu_cap(X86_FEATURE_CLZERO);
+	}
 	/* Leaf 0x1 capabilities filled in early for Xen. */
 	c->x86_capability[FEATURESET_1d] = edx;
 	c->x86_capability[FEATURESET_1c] = ecx;
--- a/xen/arch/x86/include/asm/asm-defns.h
+++ b/xen/arch/x86/include/asm/asm-defns.h
@@ -20,6 +20,10 @@
     .byte 0x0f, 0x01, 0xdd
 .endm
 
+.macro clzero
+    .byte 0x0f, 0x01, 0xfc
+.endm
+
 /*
  * Call a noreturn function. This could be JMP, but CALL results in a more
  * helpful backtrace. BUG is to catch functions which do decide to return...
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -219,10 +219,11 @@ typedef struct { u64 pfn; } pagetable_t;
 #define pagetable_from_paddr(p) pagetable_from_pfn((p)>>PAGE_SHIFT)
 #define pagetable_null() pagetable_from_pfn(0)
 
-void clear_page_sse2(void *pg);
+void clear_page_hot(void *pg);
+void clear_page_cold(void *pg);
 void copy_page_sse2(void *to, const void *from);
 
-#define clear_page(_p) clear_page_sse2(_p)
+#define clear_page(_p) clear_page_cold(_p)
 #define copy_page(_t, _f) copy_page_sse2(_t, _f)
 
 /* Convert between Xen-heap virtual addresses and machine addresses. */
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -212,6 +212,10 @@ def crunch_numbers(state):
         # the first place.
         APIC: [X2APIC, TSC_DEADLINE, EXTAPIC],
 
+        # The CLZERO insn requires a means to determine the cache line size,
+        # which is tied to the CLFLUSH insn.
+        CLFLUSH: [CLZERO],
+
         # AMD built MMXExtentions and 3DNow as extentions to MMX.
         MMX: [MMXEXT, _3DNOW],
 

From patchwork Mon Nov 25 14:30:27 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13885045
List-Id: Xen developer discussion
Date: Mon, 25 Nov 2024 15:30:27 +0100
Subject: [PATCH v3 6/7] page-alloc: make scrub_one_page() static
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Roger Pau Monné, Andrew Cooper, Julien Grall, Stefano Stabellini

Before starting to alter its properties, restrict the function's
visibility. The only external user is mem-paging, which we can
accommodate by different means.

Also move the function up in its source file, so we won't need to
forward-declare it. Constify its parameter at the same time.

Signed-off-by: Jan Beulich
Acked-by: Julien Grall
---
v3: Re-base.
v2: New.

--- a/xen/arch/x86/mm/mem_paging.c
+++ b/xen/arch/x86/mm/mem_paging.c
@@ -304,9 +304,6 @@ static int evict(struct domain *d, gfn_t
     ret = p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_4K,
                         p2m_ram_paged, a);
 
-    /* Clear content before returning the page to Xen */
-    scrub_one_page(page);
-
     /* Track number of paged gfns */
     atomic_inc(&d->paged_pages);
 
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -137,6 +137,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -774,6 +775,21 @@ static void page_list_add_scrub(struct p
 #endif
 #define SCRUB_BYTE_PATTERN (SCRUB_PATTERN & 0xff)
 
+static void scrub_one_page(const struct page_info *pg)
+{
+    if ( unlikely(pg->count_info & PGC_broken) )
+        return;
+
+#ifndef NDEBUG
+    /* Avoid callers relying on allocations returning zeroed pages. */
+    unmap_domain_page(memset(__map_domain_page(pg),
+                             SCRUB_BYTE_PATTERN, PAGE_SIZE));
+#else
+    /* For a production build, clear_page() is the fastest way to scrub. */
+    clear_domain_page(_mfn(page_to_mfn(pg)));
+#endif
+}
+
 static void poison_one_page(struct page_info *pg)
 {
 #ifdef CONFIG_SCRUB_DEBUG
@@ -2548,10 +2564,12 @@ void free_domheap_pages(struct page_info
             /*
              * Normally we expect a domain to clear pages before freeing them,
              * if it cares about the secrecy of their contents. However, after
-             * a domain has died we assume responsibility for erasure. We do
-             * scrub regardless if option scrub_domheap is set.
+             * a domain has died or if it has mem-paging enabled we assume
+             * responsibility for erasure. We do scrub regardless if option
+             * scrub_domheap is set.
              */
-            scrub = d->is_dying || scrub_debug || opt_scrub_domheap;
+            scrub = d->is_dying || mem_paging_enabled(d) ||
+                    scrub_debug || opt_scrub_domheap;
         }
         else
         {
@@ -2635,22 +2653,6 @@ static __init int cf_check pagealloc_key
 }
 __initcall(pagealloc_keyhandler_init);
 
-
-void scrub_one_page(struct page_info *pg)
-{
-    if ( unlikely(pg->count_info & PGC_broken) )
-        return;
-
-#ifndef NDEBUG
-    /* Avoid callers relying on allocations returning zeroed pages. */
-    unmap_domain_page(memset(__map_domain_page(pg),
-                             SCRUB_BYTE_PATTERN, PAGE_SIZE));
-#else
-    /* For a production build, clear_page() is the fastest way to scrub. */
-    clear_domain_page(_mfn(page_to_mfn(pg)));
-#endif
-}
-
 static void cf_check dump_heap(unsigned char key)
 {
     s_time_t now = NOW();
--- a/xen/arch/x86/include/asm/mem_paging.h
+++ b/xen/arch/x86/include/asm/mem_paging.h
@@ -12,12 +12,6 @@
 
 int mem_paging_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_paging_op_t) arg);
 
-#ifdef CONFIG_MEM_PAGING
-# define mem_paging_enabled(d) vm_event_check_ring((d)->vm_event_paging)
-#else
-# define mem_paging_enabled(d) false
-#endif
-
 #endif /*__ASM_X86_MEM_PAGING_H__ */
 
 /*
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -532,8 +532,6 @@ static inline unsigned int get_order_fro
     return order;
 }
 
-void scrub_one_page(struct page_info *pg);
-
 #ifndef arch_free_heap_page
 #define arch_free_heap_page(d, pg) \
     page_list_del(pg, page_to_list(d, pg))
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1203,6 +1203,12 @@ static always_inline bool is_iommu_enabl
     return evaluate_nospec(d->options & XEN_DOMCTL_CDF_iommu);
 }
 
+#ifdef CONFIG_MEM_PAGING
+# define mem_paging_enabled(d) vm_event_check_ring((d)->vm_event_paging)
+#else
+# define mem_paging_enabled(d) false
+#endif
+
 extern bool sched_smt_power_savings;
 extern bool sched_disable_smt_switching;
 

From patchwork Mon Nov 25 14:32:23 2024
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 13885039
List-Id: Xen developer discussion
Message-ID: <49b0a003-3fae-4908-ba63-a1c764293755@suse.com>
Date: Mon, 25 Nov 2024 15:32:23 +0100
Subject: [PATCH v3 7/7] mm: allow page scrubbing routine(s) to be arch controlled
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Roger Pau Monné, Andrew Cooper, Julien Grall, Stefano Stabellini,
 Volodymyr Babchuk, Bertrand Marquis, Michal Orzel, Bobby Eshleman,
 Alistair Francis, Connor Davis, Shawn Anastasio

Especially when dealing with large amounts of memory, memset() may not
be very efficient; this can be bad enough that even for debug builds a
custom function is warranted. We additionally want to distinguish "hot"
and "cold" cases (with, as initial heuristic, "hot" being for any
allocations a domain does for itself, assuming that in all other cases
the page wouldn't be accessed [again] soon). The goal is for accesses of
"cold" pages to not disturb caches (albeit finding a good balance between
this and the higher latency looks to be difficult).

Keep the default fallback to clear_page_*() in common code; this may
want to be revisited down the road.

Signed-off-by: Jan Beulich
Acked-by: Julien Grall
---
v3: Re-base.
v2: New.
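For illustration only (not part of the patch): a small C sketch of why the byte-wise and quadword-wise scrub variants can be used interchangeably, namely that SCRUB_PATTERN is a repeating series of bytes. The value 0xc2c2c2c2c2c2c2c2 and PAGE_SIZE of 4096 are assumptions made for this sketch, and the function names are hypothetical; only the SCRUB_BYTE_PATTERN definition is taken from page_alloc.c as quoted in this series.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u                          /* assumption: 4 KiB pages */
#define SCRUB_PATTERN 0xc2c2c2c2c2c2c2c2ULL      /* assumed debug pattern */
#define SCRUB_BYTE_PATTERN (SCRUB_PATTERN & 0xff)

/* Byte-wise fill: the C analogue of the REP STOSB variant. */
static void scrub_bytes(void *page)
{
    memset(page, (int)SCRUB_BYTE_PATTERN, PAGE_SIZE);
}

/* Quadword fill: the C analogue of the REP STOSQ and MOVNTI variants. */
static void scrub_quads(void *page)
{
    uint64_t *q = page;
    unsigned int i;

    for (i = 0; i < PAGE_SIZE / sizeof(*q); i++)
        q[i] = SCRUB_PATTERN;
}

/* Returns 1 if both fills leave identical page contents, 0 otherwise. */
static int scrub_variants_agree(void)
{
    static uint64_t a[PAGE_SIZE / sizeof(uint64_t)];
    static uint64_t b[PAGE_SIZE / sizeof(uint64_t)];

    scrub_bytes(a);
    scrub_quads(b);
    return memcmp(a, b, PAGE_SIZE) == 0;
}
```

A pattern whose bytes differ (say 0x0123456789abcdef) would make the two fills disagree, which is presumably why the pattern is required to repeat.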
---
The choice between hot and cold in scrub_one_page()'s callers is
certainly up for discussion / improvement.

--- a/xen/arch/arm/include/asm/page.h
+++ b/xen/arch/arm/include/asm/page.h
@@ -144,6 +144,12 @@ extern size_t dcache_line_bytes;
 
 #define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
 
+#define clear_page_hot clear_page
+#define clear_page_cold clear_page
+
+#define scrub_page_hot(page) memset(page, SCRUB_BYTE_PATTERN, PAGE_SIZE)
+#define scrub_page_cold scrub_page_hot
+
 static inline size_t read_dcache_line_bytes(void)
 {
     register_t ctr;
--- a/xen/arch/ppc/include/asm/page.h
+++ b/xen/arch/ppc/include/asm/page.h
@@ -190,6 +190,12 @@ static inline void invalidate_icache(voi
 #define clear_page(page) memset(page, 0, PAGE_SIZE)
 #define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
 
+#define clear_page_hot clear_page
+#define clear_page_cold clear_page
+
+#define scrub_page_hot(page) memset(page, SCRUB_BYTE_PATTERN, PAGE_SIZE)
+#define scrub_page_cold scrub_page_hot
+
 /* TODO: Flush the dcache for an entire page. */
 static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
 {
--- a/xen/arch/riscv/include/asm/page.h
+++ b/xen/arch/riscv/include/asm/page.h
@@ -156,6 +156,12 @@ static inline void invalidate_icache(voi
 #define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
 #define copy_page(dp, sp) memcpy(dp, sp, PAGE_SIZE)
 
+#define clear_page_hot clear_page
+#define clear_page_cold clear_page
+
+#define scrub_page_hot(page) memset(page, SCRUB_BYTE_PATTERN, PAGE_SIZE)
+#define scrub_page_cold scrub_page_hot
+
 /* TODO: Flush the dcache for an entire page. */
 static inline void flush_page_to_ram(unsigned long mfn, bool sync_icache)
 {
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -59,6 +59,7 @@ obj-y += pci.o
 obj-y += physdev.o
 obj-$(CONFIG_COMPAT) += x86_64/physdev.o
 obj-$(CONFIG_X86_PSR) += psr.o
+obj-bin-$(CONFIG_DEBUG) += scrub_page.o
 obj-y += setup.o
 obj-y += shutdown.o
 obj-y += smp.o
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -226,6 +226,11 @@ void copy_page_sse2(void *to, const void
 #define clear_page(_p) clear_page_cold(_p)
 #define copy_page(_t, _f) copy_page_sse2(_t, _f)
 
+#ifdef CONFIG_DEBUG
+void scrub_page_hot(void *);
+void scrub_page_cold(void *);
+#endif
+
 /* Convert between Xen-heap virtual addresses and machine addresses. */
 #define __pa(x) (virt_to_maddr(x))
 #define __va(x) (maddr_to_virt(x))
--- /dev/null
+++ b/xen/arch/x86/scrub_page.S
@@ -0,0 +1,39 @@
+        .file __FILE__
+
+#include
+#include
+#include
+
+FUNC(scrub_page_cold)
+        mov     $PAGE_SIZE/32, %ecx
+        mov     $SCRUB_PATTERN, %rax
+
+0:      movnti  %rax, (%rdi)
+        movnti  %rax, 8(%rdi)
+        movnti  %rax, 16(%rdi)
+        movnti  %rax, 24(%rdi)
+        add     $32, %rdi
+        sub     $1, %ecx
+        jnz     0b
+
+        sfence
+        ret
+END(scrub_page_cold)
+
+        .macro scrub_page_stosb
+        mov     $PAGE_SIZE, %ecx
+        mov     $SCRUB_BYTE_PATTERN, %eax
+        rep stosb
+        ret
+        .endm
+
+        .macro scrub_page_stosq
+        mov     $PAGE_SIZE/8, %ecx
+        mov     $SCRUB_PATTERN, %rax
+        rep stosq
+        ret
+        .endm
+
+FUNC(scrub_page_hot)
+        ALTERNATIVE scrub_page_stosq, scrub_page_stosb, X86_FEATURE_ERMS
+END(scrub_page_hot)
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -134,6 +134,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -767,27 +768,31 @@ static void page_list_add_scrub(struct p
     page_list_add(pg, &heap(node, zone, order));
 }
 
-/* SCRUB_PATTERN needs to be a repeating series of bytes. */
-#ifndef NDEBUG
-#define SCRUB_PATTERN 0xc2c2c2c2c2c2c2c2ULL
-#else
-#define SCRUB_PATTERN 0ULL
+/*
+ * While in debug builds we want callers to avoid relying on allocations
+ * returning zeroed pages, for a production build, clear_page_*() is the
+ * fastest way to scrub.
+ */
+#ifndef CONFIG_DEBUG
+# undef scrub_page_hot
+# define scrub_page_hot clear_page_hot
+# undef scrub_page_cold
+# define scrub_page_cold clear_page_cold
 #endif
-#define SCRUB_BYTE_PATTERN (SCRUB_PATTERN & 0xff)
 
-static void scrub_one_page(const struct page_info *pg)
+static void scrub_one_page(const struct page_info *pg, bool cold)
 {
+    void *ptr;
+
     if ( unlikely(pg->count_info & PGC_broken) )
         return;
 
-#ifndef NDEBUG
-    /* Avoid callers relying on allocations returning zeroed pages. */
-    unmap_domain_page(memset(__map_domain_page(pg),
-                             SCRUB_BYTE_PATTERN, PAGE_SIZE));
-#else
-    /* For a production build, clear_page() is the fastest way to scrub. */
-    clear_domain_page(_mfn(page_to_mfn(pg)));
-#endif
+    ptr = __map_domain_page(pg);
+    if ( cold )
+        scrub_page_cold(ptr);
+    else
+        scrub_page_hot(ptr);
+    unmap_domain_page(ptr);
 }
 
 static void poison_one_page(struct page_info *pg)
@@ -1067,12 +1072,14 @@ static struct page_info *alloc_heap_page
     if ( first_dirty != INVALID_DIRTY_IDX ||
          (scrub_debug && !(memflags & MEMF_no_scrub)) )
     {
+        bool cold = d && d != current->domain;
+
         for ( i = 0; i < (1U << order); i++ )
         {
             if ( test_and_clear_bit(_PGC_need_scrub, &pg[i].count_info) )
             {
                 if ( !(memflags & MEMF_no_scrub) )
-                    scrub_one_page(&pg[i]);
+                    scrub_one_page(&pg[i], cold);
 
                 dirty_cnt++;
             }
@@ -1337,7 +1344,7 @@ bool scrub_free_pages(void)
                 {
                     if ( test_bit(_PGC_need_scrub, &pg[i].count_info) )
                     {
-                        scrub_one_page(&pg[i]);
+                        scrub_one_page(&pg[i], true);
                         /*
                          * We can modify count_info without holding heap
                          * lock since we effectively locked this buddy by
@@ -2042,7 +2049,7 @@ static void __init cf_check smp_scrub_he
             if ( !mfn_valid(_mfn(mfn)) || !page_state_is(pg, free) )
                 continue;
 
-            scrub_one_page(pg);
+            scrub_one_page(pg, true);
         }
     }
 
@@ -2735,7 +2742,7 @@ void unprepare_staticmem_pages(struct pa
         if ( need_scrub )
         {
             /* TODO: asynchronous scrubbing for pages of static memory. */
-            scrub_one_page(pg);
+            scrub_one_page(pg, true);
         }
 
         pg[i].count_info |= PGC_static;
--- /dev/null
+++ b/xen/include/xen/scrub.h
@@ -0,0 +1,24 @@
+#ifndef __XEN_SCRUB_H__
+#define __XEN_SCRUB_H__
+
+#include
+
+/* SCRUB_PATTERN needs to be a repeating series of bytes. */
+#ifdef CONFIG_DEBUG
+# define SCRUB_PATTERN _AC(0xc2c2c2c2c2c2c2c2,ULL)
+#else
+# define SCRUB_PATTERN _AC(0,ULL)
+#endif
+#define SCRUB_BYTE_PATTERN (SCRUB_PATTERN & 0xff)
+
+#endif /* __XEN_SCRUB_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
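
The requirement that SCRUB_PATTERN be a repeating series of bytes is what lets the byte-wise and the 64-bit-wise scrub variants leave identical page contents. A minimal user-space sketch of that property, and of the hot/cold dispatch in scrub_one_page(), might look like the following. It is not the hypervisor code: every `model_`/`MODEL_` name is hypothetical, a 4096-byte page is assumed, the debug-build pattern value is copied from the patch, and ordinary stores replace `movnti` because portable C has no non-temporal store.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE          4096u                 /* assumed page size */
#define MODEL_SCRUB_PATTERN      0xc2c2c2c2c2c2c2c2ULL /* debug value from the patch */
#define MODEL_SCRUB_BYTE_PATTERN (MODEL_SCRUB_PATTERN & 0xff)

/* "Hot" variant: plain memset(), like the Arm/PPC/RISC-V fallback. */
void model_scrub_page_hot(void *page)
{
    memset(page, (int)MODEL_SCRUB_BYTE_PATTERN, MODEL_PAGE_SIZE);
}

/*
 * "Cold" variant: four 64-bit stores per iteration, mirroring the shape
 * of the movnti loop. The page must be 8-byte aligned.
 */
void model_scrub_page_cold(void *page)
{
    uint64_t *p = page;

    for (size_t i = 0; i < MODEL_PAGE_SIZE / 32; i++, p += 4) {
        p[0] = MODEL_SCRUB_PATTERN;
        p[1] = MODEL_SCRUB_PATTERN;
        p[2] = MODEL_SCRUB_PATTERN;
        p[3] = MODEL_SCRUB_PATTERN;
    }
}

/* Dispatch on the caller's hint, as scrub_one_page() does after mapping. */
void model_scrub_one_page(void *page, bool cold)
{
    if (cold)
        model_scrub_page_cold(page);
    else
        model_scrub_page_hot(page);
}

/* Returns true when every byte of the page equals the byte pattern. */
bool model_page_is_scrubbed(const void *page)
{
    const unsigned char *b = page;

    for (size_t i = 0; i < MODEL_PAGE_SIZE; i++)
        if (b[i] != (unsigned char)MODEL_SCRUB_BYTE_PATTERN)
            return false;
    return true;
}
```

Scrubbing one buffer with `cold` set and another with it clear should leave both comparing equal under memcmp(); with a pattern that does not repeat per byte, the two variants would diverge.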