From patchwork Fri Oct 18 06:40:49 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P"
X-Patchwork-Id: 13841228
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
    yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
    usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
    21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
    herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
    ardb@kernel.org, ebiggers@google.com, surenb@google.com,
    kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
    brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
    joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
    linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 01/13] crypto: acomp - Add a poll() operation to acomp_alg and acomp_req
Date: Thu, 17 Oct 2024 23:40:49 -0700
Message-Id: <20241018064101.336232-2-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

For async compress/decompress, provide a way for the caller to poll for
completion rather than wait for an interrupt to signal it. Callers can
submit a request with crypto_acomp_compress() or crypto_acomp_decompress()
and then, instead of waiting on a completion, call crypto_acomp_poll() to
check whether the operation has finished. This is useful for hardware
accelerators where the overhead of interrupts and waiting for completions
is too expensive.
Typically the compress/decompress hardware operations complete very
quickly; in the vast majority of cases, the overhead of interrupt
handling and waiting for completions simply adds unnecessary delays and
cancels out the gains of using hardware acceleration.

Signed-off-by: Tom Zanussi
Signed-off-by: Kanchana P Sridhar
---
 crypto/acompress.c                  |  1 +
 include/crypto/acompress.h          | 18 ++++++++++++++++++
 include/crypto/internal/acompress.h |  1 +
 3 files changed, 20 insertions(+)

diff --git a/crypto/acompress.c b/crypto/acompress.c
index 6fdf0ff9f3c0..00ec7faa2714 100644
--- a/crypto/acompress.c
+++ b/crypto/acompress.c
@@ -71,6 +71,7 @@ static int crypto_acomp_init_tfm(struct crypto_tfm *tfm)

 	acomp->compress = alg->compress;
 	acomp->decompress = alg->decompress;
+	acomp->poll = alg->poll;
 	acomp->dst_free = alg->dst_free;
 	acomp->reqsize = alg->reqsize;

diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index 54937b615239..65b5de30c8b1 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -51,6 +51,7 @@ struct acomp_req {
 struct crypto_acomp {
 	int (*compress)(struct acomp_req *req);
 	int (*decompress)(struct acomp_req *req);
+	int (*poll)(struct acomp_req *req);
 	void (*dst_free)(struct scatterlist *dst);
 	unsigned int reqsize;
 	struct crypto_tfm base;
@@ -265,4 +266,21 @@ static inline int crypto_acomp_decompress(struct acomp_req *req)
 	return crypto_acomp_reqtfm(req)->decompress(req);
 }

+/**
+ * crypto_acomp_poll() -- Invoke asynchronous poll operation
+ *
+ * Function invokes the asynchronous poll operation
+ *
+ * @req: asynchronous request
+ *
+ * Return: zero on poll completion, -EAGAIN if not complete, or
+ * error code in case of error
+ */
+static inline int crypto_acomp_poll(struct acomp_req *req)
+{
+	struct crypto_acomp *tfm = crypto_acomp_reqtfm(req);
+
+	return tfm->poll(req);
+}
+
 #endif

diff --git a/include/crypto/internal/acompress.h b/include/crypto/internal/acompress.h
index 8831edaafc05..fbf5f6c6eeb6 100644
--- a/include/crypto/internal/acompress.h
+++ b/include/crypto/internal/acompress.h
@@ -37,6 +37,7 @@ struct acomp_alg {

 	int (*compress)(struct acomp_req *req);
 	int (*decompress)(struct acomp_req *req);
+	int (*poll)(struct acomp_req *req);
 	void (*dst_free)(struct scatterlist *dst);
 	int (*init)(struct crypto_acomp *tfm);
 	void (*exit)(struct crypto_acomp *tfm);

From patchwork Fri Oct 18 06:40:50 2024
X-Patchwork-Id: 13841229
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 02/13] crypto: iaa - Add support for irq-less crypto async interface
Date: Thu, 17 Oct 2024 23:40:50 -0700
Message-Id: <20241018064101.336232-3-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
Add a crypto acomp poll() implementation so that callers can use true
async IAA compress/decompress without interrupts.

To use this mode with zswap, select the 'async' iaa_crypto sync_mode:

  echo async > /sys/bus/dsa/drivers/crypto/sync_mode

This causes the iaa_crypto driver to register its acomp_alg
implementation with a non-NULL poll() member, which callers such as
zswap can check for and, if present, use to drive true async mode.
Signed-off-by: Tom Zanussi
Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 74 ++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 237f87000070..6a8577ac1330 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -1788,6 +1788,74 @@ static void compression_ctx_init(struct iaa_compression_ctx *ctx)
 	ctx->use_irq = use_irq;
 }

+static int iaa_comp_poll(struct acomp_req *req)
+{
+	struct idxd_desc *idxd_desc;
+	struct idxd_device *idxd;
+	struct iaa_wq *iaa_wq;
+	struct pci_dev *pdev;
+	struct device *dev;
+	struct idxd_wq *wq;
+	bool compress_op;
+	int ret;
+
+	idxd_desc = req->base.data;
+	if (!idxd_desc)
+		return -EAGAIN;
+
+	compress_op = (idxd_desc->iax_hw->opcode == IAX_OPCODE_COMPRESS);
+	wq = idxd_desc->wq;
+	iaa_wq = idxd_wq_get_private(wq);
+	idxd = iaa_wq->iaa_device->idxd;
+	pdev = idxd->pdev;
+	dev = &pdev->dev;
+
+	ret = check_completion(dev, idxd_desc->iax_completion, true, true);
+	if (ret == -EAGAIN)
+		return ret;
+	if (ret)
+		goto out;
+
+	req->dlen = idxd_desc->iax_completion->output_size;
+
+	/* Update stats */
+	if (compress_op) {
+		update_total_comp_bytes_out(req->dlen);
+		update_wq_comp_bytes(wq, req->dlen);
+	} else {
+		update_total_decomp_bytes_in(req->slen);
+		update_wq_decomp_bytes(wq, req->slen);
+	}
+
+	if (iaa_verify_compress && (idxd_desc->iax_hw->opcode == IAX_OPCODE_COMPRESS)) {
+		struct crypto_tfm *tfm = req->base.tfm;
+		dma_addr_t src_addr, dst_addr;
+		u32 compression_crc;
+
+		compression_crc = idxd_desc->iax_completion->crc;
+
+		dma_sync_sg_for_device(dev, req->dst, 1, DMA_FROM_DEVICE);
+		dma_sync_sg_for_device(dev, req->src, 1, DMA_TO_DEVICE);
+
+		src_addr = sg_dma_address(req->src);
+		dst_addr = sg_dma_address(req->dst);
+
+		ret = iaa_compress_verify(tfm, req, wq, src_addr, req->slen,
+					  dst_addr, &req->dlen, compression_crc);
+	}
+out:
+	/* caller doesn't call crypto_wait_req, so no acomp_request_complete() */
+
+	dma_unmap_sg(dev, req->dst, sg_nents(req->dst), DMA_FROM_DEVICE);
+	dma_unmap_sg(dev, req->src, sg_nents(req->src), DMA_TO_DEVICE);
+
+	idxd_free_desc(idxd_desc->wq, idxd_desc);
+
+	dev_dbg(dev, "%s: returning ret=%d\n", __func__, ret);
+
+	return ret;
+}
+
 static int iaa_comp_init_fixed(struct crypto_acomp *acomp_tfm)
 {
 	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp_tfm);
@@ -1813,6 +1881,7 @@ static struct acomp_alg iaa_acomp_fixed_deflate = {
 	.compress = iaa_comp_acompress,
 	.decompress = iaa_comp_adecompress,
 	.dst_free = dst_free,
+	.poll = iaa_comp_poll,
 	.base = {
 		.cra_name = "deflate",
 		.cra_driver_name = "deflate-iaa",
@@ -1827,6 +1896,11 @@ static int iaa_register_compression_device(void)
 {
 	int ret;

+	if (async_mode && !use_irq)
+		iaa_acomp_fixed_deflate.poll = iaa_comp_poll;
+	else
+		iaa_acomp_fixed_deflate.poll = NULL;
+
 	ret = crypto_register_acomp(&iaa_acomp_fixed_deflate);
 	if (ret) {
 		pr_err("deflate algorithm acomp fixed registration failed (%d)\n", ret);

From patchwork Fri Oct 18 06:40:51 2024
X-Patchwork-Id: 13841231
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 03/13] crypto: testmgr - Add crypto testmgr acomp poll support
Date: Thu, 17 Oct 2024 23:40:51 -0700
Message-Id: <20241018064101.336232-4-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
This patch enables the newly added acomp poll API to be exercised by the
crypto test_acomp() compress/decompress calls, if the acomp registers a
poll method.

Signed-off-by: Glover, Andre
Signed-off-by: Kanchana P Sridhar
---
 crypto/testmgr.c | 70 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 65 insertions(+), 5 deletions(-)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index ee8da628e9da..54f6f59ae501 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3482,7 +3482,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   crypto_req_done, &wait);

-	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_compress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: compression failed on test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3498,7 +3510,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	crypto_init_wait(&wait);
 	acomp_request_set_params(req, &src, &dst, ilen, dlen);

-	ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_decompress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: compression failed on test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3531,7 +3555,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	sg_init_one(&src, input_vec, ilen);
 	acomp_request_set_params(req, &src, NULL, ilen, 0);

-	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_compress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: compression failed on NULL dst buffer test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3574,7 +3610,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   crypto_req_done, &wait);

-	ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_decompress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: decompression failed on test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3606,7 +3654,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	crypto_init_wait(&wait);
 	acomp_request_set_params(req, &src, NULL, ilen, 0);

-	ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_decompress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: decompression failed on NULL dst buffer test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);

From patchwork Fri Oct 18 06:40:52 2024
X-Patchwork-Id: 13841230
From patchwork Fri Oct 18 06:40:52 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841230
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 04/13] mm: zswap: zswap_compress()/decompress() can submit, then poll an acomp_req.
Date: Thu, 17 Oct 2024 23:40:52 -0700
Message-Id: <20241018064101.336232-5-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
If the crypto_acomp has a poll interface registered, zswap_compress() and
zswap_decompress() will submit the acomp_req and then poll() for a
completion/error status in a busy-wait loop. This provides an
interrupt-free, asynchronous way to manage (potentially multiple)
acomp_reqs, as supported by the iaa_crypto driver.

This enables batch submission of multiple compression/decompression jobs
to the Intel IAA hardware accelerator, which processes them in parallel,
followed by polling the batch's acomp_reqs for completion status.

Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 mm/zswap.c | 51 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index f6316b66fb23..948c9745ee57 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -910,18 +910,34 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);

 	/*
-	 * it maybe looks a little bit silly that we send an asynchronous request,
-	 * then wait for its completion synchronously. This makes the process look
-	 * synchronous in fact.
-	 * Theoretically, acomp supports users send multiple acomp requests in one
-	 * acomp instance, then get those requests done simultaneously. but in this
-	 * case, zswap actually does store and load page by page, there is no
-	 * existing method to send the second page before the first page is done
-	 * in one thread doing zwap.
-	 * but in different threads running on different cpu, we have different
-	 * acomp instance, so multiple threads can do (de)compression in parallel.
+	 * If the crypto_acomp provides an asynchronous poll() interface,
+	 * submit the descriptor and poll for a completion status.
+	 *
+	 * It maybe looks a little bit silly that we send an asynchronous
+	 * request, then wait for its completion in a busy-wait poll loop, or,
+	 * synchronously. This makes the process look synchronous in fact.
+	 * Theoretically, acomp supports users send multiple acomp requests in
+	 * one acomp instance, then get those requests done simultaneously.
+	 * But in this case, zswap actually does store and load page by page,
+	 * there is no existing method to send the second page before the
+	 * first page is done in one thread doing zswap.
+	 * But in different threads running on different cpu, we have different
+	 * acomp instance, so multiple threads can do (de)compression in
+	 * parallel.
 	 */
-	comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+	if (acomp_ctx->acomp->poll) {
+		comp_ret = crypto_acomp_compress(acomp_ctx->req);
+		if (comp_ret == -EINPROGRESS) {
+			do {
+				comp_ret = crypto_acomp_poll(acomp_ctx->req);
+				if (comp_ret && comp_ret != -EAGAIN)
+					break;
+			} while (comp_ret);
+		}
+	} else {
+		comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+	}
+
 	dlen = acomp_ctx->req->dlen;
 	if (comp_ret)
 		goto unlock;
@@ -959,6 +975,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	struct scatterlist input, output;
 	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src;
+	int ret;

 	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
 	mutex_lock(&acomp_ctx->mutex);
@@ -984,7 +1001,17 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	sg_init_table(&output, 1);
 	sg_set_folio(&output, folio, PAGE_SIZE, 0);
 	acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
-	BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
+	if (acomp_ctx->acomp->poll) {
+		ret = crypto_acomp_decompress(acomp_ctx->req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(acomp_ctx->req);
+				BUG_ON(ret && ret != -EAGAIN);
+			} while (ret);
+		}
+	} else {
+		BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
+	}
 	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);

 	mutex_unlock(&acomp_ctx->mutex);
From patchwork Fri Oct 18 06:40:53 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841232
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 05/13] crypto: iaa - Make async mode the default.
Date: Thu, 17 Oct 2024 23:40:53 -0700
Message-Id: <20241018064101.336232-6-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
This patch makes the iaa_crypto driver load by default in the most
efficient/recommended "async" mode, namely, asynchronous submission of
descriptors, followed by polling for job completions. Previously, "sync"
mode was the default.

This way, anyone who wants to use IAA can do so after building the
kernel, and *without* having to go through these steps to use async
poll:

 1) disable all the IAA device/wq bindings that happen at boot time
 2) rmmod iaa_crypto
 3) modprobe iaa_crypto
 4) echo async > /sys/bus/dsa/drivers/crypto/sync_mode
 5) re-run initialization of the IAA devices and wqs

Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 6a8577ac1330..6c262b1eb09d 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -153,7 +153,7 @@ static DRIVER_ATTR_RW(verify_compress);
  */

 /* Use async mode */
-static bool async_mode;
+static bool async_mode = true;

 /* Use interrupts */
 static bool use_irq;
From patchwork Fri Oct 18 06:40:54 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841233
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 06/13] crypto: iaa - Disable iaa_verify_compress by default.
Date: Thu, 17 Oct 2024 23:40:54 -0700
Message-Id: <20241018064101.336232-7-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

This patch makes the iaa_crypto driver load by default with
"iaa_verify_compress" disabled, to facilitate performance comparisons
with software compressors (which also do not run compress verification
by default). Previously, iaa_crypto compress verification was enabled by
default.

With this patch, users who want to enable compress verification can do
so with these steps:

 1) disable all the IAA device/wq bindings that happen at boot time
 2) rmmod iaa_crypto
 3) modprobe iaa_crypto
 4) echo 1 > /sys/bus/dsa/drivers/crypto/verify_compress
 5) re-run initialization of the IAA devices and wqs

Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 6c262b1eb09d..8e6859c97970 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -94,7 +94,7 @@ static bool iaa_crypto_enabled;
 static bool iaa_crypto_registered;

 /* Verify results of IAA compress or not */
-static bool iaa_verify_compress = true;
+static bool iaa_verify_compress = false;

 static ssize_t verify_compress_show(struct device_driver *driver, char *buf)
 {
From patchwork Fri Oct 18 06:40:55 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841234
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 07/13] crypto: iaa - Change cpu-to-iaa mappings to evenly balance cores to IAAs.
Date: Thu, 17 Oct 2024 23:40:55 -0700
Message-Id: <20241018064101.336232-8-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
This change distributes the cpus more evenly among the IAAs in each
socket.

Old algorithm to assign cpus to IAA:
------------------------------------
If "nr_cpus" = nr_logical_cpus (includes hyper-threading), the current
algorithm determines "nr_cpus_per_node" = nr_cpus / nr_nodes. Hence, on
a 2-socket Sapphire Rapids server where each socket has 56 cores and 4
IAA devices, nr_cpus_per_node = 112.

Further, cpus_per_iaa = (nr_nodes * nr_cpus_per_node) / nr_iaa
Hence, cpus_per_iaa = 224/8 = 28.

The iaa_crypto driver then assigns 28 "logical" node cpus per IAA device
on that node, which results in this cpu-to-iaa mapping:

  lscpu|grep NUMA
  NUMA node(s):       2
  NUMA node0 CPU(s):  0-55,112-167
  NUMA node1 CPU(s):  56-111,168-223

  NUMA node 0:
  cpu   0-27     28-55    112-139   140-167
  iaa   iax1     iax3     iax5      iax7

  NUMA node 1:
  cpu   56-83    84-111   168-195   196-223
  iaa   iax9     iax11    iax13     iax15

This appears non-optimal for a few reasons:

1) The 2 logical threads on a core will get assigned to different IAA
   devices. E.g.:

   cpu 0:   iax1
   cpu 112: iax5

2) One of the logical threads on a core is assigned to an IAA that is
   not closest to that core, e.g. cpu 112.

3) If numactl is used to start processes sequentially on the logical
   cores, some of the IAA devices on the socket could be
   over-subscribed, while some could be under-utilized.

This patch introduces a scheme to more evenly balance the logical cores
to IAA devices on a socket.
If "nr_cpus" = nr_logical_cpus (includes hyper-threading), the new
algorithm determines "nr_cpus_per_node" = topology_num_cores_per_package().
Hence, on a 2-socket Sapphire Rapids server where each socket has 56
cores and 4 IAA devices, nr_cpus_per_node = 56.

Further, cpus_per_iaa = (nr_nodes * nr_cpus_per_node) / nr_iaa
Hence, cpus_per_iaa = 112/8 = 14.

The iaa_crypto driver then assigns 14 "logical" node cpus per IAA device
on that node, which results in this cpu-to-iaa mapping:

NUMA node 0:
cpu   0-13,112-125    14-27,126-139    28-41,140-153    42-55,154-167
iaa   iax1            iax3             iax5             iax7

NUMA node 1:
cpu   56-69,168-181   70-83,182-195    84-97,196-209    98-111,210-223
iaa   iax9            iax11            iax13            iax15

This resolves the 3 issues with non-optimality of cpu-to-iaa mappings
pointed out earlier with the existing approach.

Originally-by: Tom Zanussi
Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 84 ++++++++++++++--------
 1 file changed, 54 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 8e6859c97970..c854a7a1aaa4 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -55,6 +55,46 @@ static struct idxd_wq *wq_table_next_wq(int cpu)
 	return entry->wqs[entry->cur_wq];
 }
 
+/*
+ * Given a cpu, find the closest IAA instance. The idea is to try to
+ * choose the most appropriate IAA instance for a caller and spread
+ * available workqueues around to clients.
+ */
+static inline int cpu_to_iaa(int cpu)
+{
+	int node, n_cpus = 0, test_cpu, iaa = 0;
+	int nr_iaa_per_node;
+	const struct cpumask *node_cpus;
+
+	if (!nr_nodes)
+		return 0;
+
+	nr_iaa_per_node = nr_iaa / nr_nodes;
+	if (!nr_iaa_per_node)
+		return 0;
+
+	for_each_online_node(node) {
+		node_cpus = cpumask_of_node(node);
+		if (!cpumask_test_cpu(cpu, node_cpus))
+			continue;
+
+		for_each_cpu(test_cpu, node_cpus) {
+			if ((n_cpus % nr_cpus_per_node) == 0)
+				iaa = node * nr_iaa_per_node;
+
+			if (test_cpu == cpu)
+				return iaa;
+
+			n_cpus++;
+
+			if ((n_cpus % cpus_per_iaa) == 0)
+				iaa++;
+		}
+	}
+
+	return -1;
+}
+
 static void wq_table_add(int cpu, struct idxd_wq *wq)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
@@ -895,8 +935,7 @@ static int wq_table_add_wqs(int iaa, int cpu)
  */
 static void rebalance_wq_table(void)
 {
-	const struct cpumask *node_cpus;
-	int node, cpu, iaa = -1;
+	int cpu, iaa;
 
 	if (nr_iaa == 0)
 		return;
@@ -906,37 +945,22 @@ static void rebalance_wq_table(void)
 
 	clear_wq_table();
 
-	if (nr_iaa == 1) {
-		for (cpu = 0; cpu < nr_cpus; cpu++) {
-			if (WARN_ON(wq_table_add_wqs(0, cpu))) {
-				pr_debug("could not add any wqs for iaa 0 to cpu %d!\n", cpu);
-				return;
-			}
-		}
-
-		return;
-	}
-
-	for_each_node_with_cpus(node) {
-		node_cpus = cpumask_of_node(node);
-
-		for (cpu = 0; cpu < cpumask_weight(node_cpus); cpu++) {
-			int node_cpu = cpumask_nth(cpu, node_cpus);
-
-			if (WARN_ON(node_cpu >= nr_cpu_ids)) {
-				pr_debug("node_cpu %d doesn't exist!\n", node_cpu);
-				return;
-			}
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+		iaa = cpu_to_iaa(cpu);
+		pr_debug("rebalance: cpu=%d iaa=%d\n", cpu, iaa);
 
-			if ((cpu % cpus_per_iaa) == 0)
-				iaa++;
+		if (WARN_ON(iaa == -1)) {
+			pr_debug("rebalance (cpu_to_iaa(%d)) failed!\n", cpu);
+			return;
+		}
 
-			if (WARN_ON(wq_table_add_wqs(iaa, node_cpu))) {
-				pr_debug("could not add any wqs for iaa %d to cpu %d!\n", iaa, cpu);
-				return;
-			}
+		if (WARN_ON(wq_table_add_wqs(iaa, cpu))) {
+			pr_debug("could not add any wqs for iaa %d to cpu %d!\n", iaa, cpu);
+			return;
 		}
 	}
+
+	pr_debug("Finished rebalance local wqs.");
 }
 
 static inline int check_completion(struct device *dev,
@@ -2084,7 +2108,7 @@ static int __init iaa_crypto_init_module(void)
 		pr_err("IAA couldn't find any nodes with cpus\n");
 		return -ENODEV;
 	}
-	nr_cpus_per_node = nr_cpus / nr_nodes;
+	nr_cpus_per_node = topology_num_cores_per_package();
 
 	if (crypto_has_comp("deflate-generic", 0, 0))
 		deflate_generic_tfm = crypto_alloc_comp("deflate-generic", 0, 0);

From patchwork Fri Oct 18 06:40:56 2024
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
 usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
 ardb@kernel.org, ebiggers@google.com, surenb@google.com,
 kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
 brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
 joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 08/13] crypto: iaa - Distribute compress jobs to all
 IAA devices on a NUMA node.
Date: Thu, 17 Oct 2024 23:40:56 -0700
Message-Id: <20241018064101.336232-9-kanchana.p.sridhar@intel.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
MIME-Version: 1.0

This change enables processes running on any logical core on a NUMA node
to use all the IAA devices enabled on that NUMA node for compress jobs.
In other words, compressions originating from any process in a node will
be distributed in round-robin manner to the available IAA devices on the
same socket. The main premise behind this change is to make sure that no
compress engines on any IAA device are left un-utilized/under-utilized.
In other words, the compress engines on all IAA devices are considered a
global resource for that socket. This allows the use of all IAA devices
present in a given NUMA node for (batched) compressions originating from
zswap/zram, from all cores on this node.

A new per-cpu "global_wq_table" implements this in the iaa_crypto
driver. We can think of the global WQ per IAA as a WQ to which all cores
on that socket can submit compress jobs.

To avail of this feature, the user must configure 2 WQs per IAA in order
to enable distribution of compress jobs to multiple IAA devices. Each
IAA will have 2 WQs:

 wq.0 (local WQ):
   Used for decompress jobs from cores mapped by the cpu_to_iaa()
   "even balancing of logical cores to IAA devices" algorithm.

 wq.1 (global WQ):
   Used for compress jobs from *all* logical cores on that socket.

The iaa_crypto driver will place all global WQs from all same-socket IAA
devices in the global_wq_table per cpu on that socket. When the driver
receives a compress job, it will look up the "next" global WQ in the
cpu's global_wq_table to submit the descriptor.

The starting wq in the global_wq_table for each cpu is the global wq
associated with the IAA nearest to it, so that we stagger the starting
global wq for each process. This results in very uniform usage of all
IAAs for compress jobs.

Two new driver module parameters are added for this feature:

g_wqs_per_iaa (default 1):
/sys/bus/dsa/drivers/crypto/g_wqs_per_iaa
This represents the number of global WQs that can be configured per IAA
device.
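The staggered round-robin selection described above can be sketched as a small standalone model (illustrative only — `gwq_model`, `gwq_set_start` and `gwq_next` are hypothetical names, not the driver's per-cpu table code):

```c
#include <assert.h>

/*
 * Illustrative userspace model (not driver code) of the per-cpu global
 * WQ rotation: each cpu's view of the socket's global WQs starts at the
 * WQ of its nearest IAA, and successive compress jobs advance
 * round-robin through all global WQs on the socket.
 */
struct gwq_model {
	int cur_wq;	/* index of the WQ the next job is sent to */
	int n_wqs;	/* number of global WQs on this socket */
};

/* Stagger the starting WQ by the cpu's nearest-IAA index. */
static void gwq_set_start(struct gwq_model *m, int nearest_iaa, int nr_iaa)
{
	int start = (m->n_wqs / nr_iaa) * nearest_iaa;

	if (start >= 0 && start < m->n_wqs)
		m->cur_wq = start;
}

/* Pick the WQ for the next compress job, then advance round-robin. */
static int gwq_next(struct gwq_model *m)
{
	int wq = m->cur_wq;

	if (++m->cur_wq >= m->n_wqs)
		m->cur_wq = 0;
	return wq;
}
```

With 4 global WQs and 4 IAAs, a cpu nearest to IAA 2 starts at WQ 2 and cycles 2, 3, 0, 1, 2, ... so different cpus begin on different devices.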
The default is 1, and is the recommended setting to enable the use of
this feature once the user configures 2 WQs per IAA using higher level
scripts as described in
Documentation/driver-api/crypto/iaa/iaa-crypto.rst.

g_consec_descs_per_gwq (default 1):
/sys/bus/dsa/drivers/crypto/g_consec_descs_per_gwq
This represents the number of consecutive compress jobs that will be
submitted to the same global WQ (i.e. to the same IAA device) from a
given core, before moving to the next global WQ. The default is 1, which
is also the recommended setting to avail of this feature.

The decompress jobs from any core will be sent to the "local" IAA,
namely the one that the driver assigns with the cpu_to_iaa() mapping
algorithm that evenly balances the assignment of logical cores to IAA
devices on a NUMA node.

On a 2-socket Sapphire Rapids server where each socket has 56 cores and
4 IAA devices, this is how the compress/decompress jobs will be mapped
when the user configures 2 WQs per IAA device (which implies wq.1 will
be added to the global WQ table for each logical core on that NUMA
node):

lscpu|grep NUMA
NUMA node(s):        2
NUMA node0 CPU(s):   0-55,112-167
NUMA node1 CPU(s):   56-111,168-223

Compress jobs:
--------------
NUMA node 0:
All cpus (0-55,112-167) can send compress jobs to all IAA devices on the
socket (iax1/iax3/iax5/iax7) in round-robin manner:
iaa   iax1     iax3      iax5      iax7

NUMA node 1:
All cpus (56-111,168-223) can send compress jobs to all IAA devices on
the socket (iax9/iax11/iax13/iax15) in round-robin manner:
iaa   iax9     iax11     iax13     iax15

Decompress jobs:
----------------
NUMA node 0:
cpu   0-13,112-125    14-27,126-139    28-41,140-153    42-55,154-167
iaa   iax1            iax3             iax5             iax7

NUMA node 1:
cpu   56-69,168-181   70-83,182-195    84-97,196-209    98-111,210-223
iaa   iax9            iax11            iax13            iax15

Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 305 ++++++++++++++++++++-
 1 file changed, 290 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index c854a7a1aaa4..2d6c517e9d9b 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -29,14 +29,23 @@ static unsigned int nr_iaa;
 static unsigned int nr_cpus;
 static unsigned int nr_nodes;
 static unsigned int nr_cpus_per_node;
-
 /* Number of physical cpus sharing each iaa instance */
 static unsigned int cpus_per_iaa;
 
 static struct crypto_comp *deflate_generic_tfm;
 
 /* Per-cpu lookup table for balanced wqs */
-static struct wq_table_entry __percpu *wq_table;
+static struct wq_table_entry __percpu *wq_table = NULL;
+
+/* Per-cpu lookup table for global wqs shared by all cpus. */
+static struct wq_table_entry __percpu *global_wq_table = NULL;
+
+/*
+ * Per-cpu counter of consecutive descriptors allocated to
+ * the same wq in the global_wq_table, so that we know
+ * when to switch to the next wq in the global_wq_table.
+ */
+static int __percpu *num_consec_descs_per_wq = NULL;
 
 static struct idxd_wq *wq_table_next_wq(int cpu)
 {
@@ -104,26 +113,68 @@ static void wq_table_add(int cpu, struct idxd_wq *wq)
 
 	entry->wqs[entry->n_wqs++] = wq;
 
-	pr_debug("%s: added iaa wq %d.%d to idx %d of cpu %d\n", __func__,
-		 entry->wqs[entry->n_wqs - 1]->idxd->id,
-		 entry->wqs[entry->n_wqs - 1]->id, entry->n_wqs - 1, cpu);
+	pr_debug("%s: added iaa local wq %d.%d to idx %d of cpu %d\n", __func__,
+		 entry->wqs[entry->n_wqs - 1]->idxd->id,
+		 entry->wqs[entry->n_wqs - 1]->id, entry->n_wqs - 1, cpu);
+}
+
+static void global_wq_table_add(int cpu, struct idxd_wq *wq)
+{
+	struct wq_table_entry *entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (WARN_ON(entry->n_wqs == entry->max_wqs))
+		return;
+
+	entry->wqs[entry->n_wqs++] = wq;
+
+	pr_debug("%s: added iaa global wq %d.%d to idx %d of cpu %d\n", __func__,
+		 entry->wqs[entry->n_wqs - 1]->idxd->id,
+		 entry->wqs[entry->n_wqs - 1]->id, entry->n_wqs - 1, cpu);
+}
+
+static void global_wq_table_set_start_wq(int cpu)
+{
+	struct wq_table_entry *entry = per_cpu_ptr(global_wq_table, cpu);
+	int start_wq = (entry->n_wqs / nr_iaa) * cpu_to_iaa(cpu);
+
+	if ((start_wq >= 0) && (start_wq < entry->n_wqs))
+		entry->cur_wq = start_wq;
 }
 
 static void wq_table_free_entry(int cpu)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
 
-	kfree(entry->wqs);
-	memset(entry, 0, sizeof(*entry));
+	if (entry) {
+		kfree(entry->wqs);
+		memset(entry, 0, sizeof(*entry));
+	}
+
+	entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (entry) {
+		kfree(entry->wqs);
+		memset(entry, 0, sizeof(*entry));
+	}
 }
 
 static void wq_table_clear_entry(int cpu)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
 
-	entry->n_wqs = 0;
-	entry->cur_wq = 0;
-	memset(entry->wqs, 0, entry->max_wqs * sizeof(struct idxd_wq *));
+	if (entry) {
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+		memset(entry->wqs, 0, entry->max_wqs * sizeof(struct idxd_wq *));
+	}
+
+	entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (entry) {
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+		memset(entry->wqs, 0, entry->max_wqs * sizeof(struct idxd_wq *));
+	}
 }
 
 LIST_HEAD(iaa_devices);
@@ -163,6 +214,70 @@ static ssize_t verify_compress_store(struct device_driver *driver,
 }
 static DRIVER_ATTR_RW(verify_compress);
 
+/* Number of global wqs per iaa*/
+static int g_wqs_per_iaa = 1;
+
+static ssize_t g_wqs_per_iaa_show(struct device_driver *driver, char *buf)
+{
+	return sprintf(buf, "%d\n", g_wqs_per_iaa);
+}
+
+static ssize_t g_wqs_per_iaa_store(struct device_driver *driver,
+				   const char *buf, size_t count)
+{
+	int ret = -EBUSY;
+
+	mutex_lock(&iaa_devices_lock);
+
+	if (iaa_crypto_enabled)
+		goto out;
+
+	ret = kstrtoint(buf, 10, &g_wqs_per_iaa);
+	if (ret)
+		goto out;
+
+	ret = count;
+out:
+	mutex_unlock(&iaa_devices_lock);
+
+	return ret;
+}
+static DRIVER_ATTR_RW(g_wqs_per_iaa);
+
+/*
+ * Number of consecutive descriptors to allocate from a
+ * given global wq before switching to the next wq in
+ * the global_wq_table.
+ */
+static int g_consec_descs_per_gwq = 1;
+
+static ssize_t g_consec_descs_per_gwq_show(struct device_driver *driver, char *buf)
+{
+	return sprintf(buf, "%d\n", g_consec_descs_per_gwq);
+}
+
+static ssize_t g_consec_descs_per_gwq_store(struct device_driver *driver,
+					    const char *buf, size_t count)
+{
+	int ret = -EBUSY;
+
+	mutex_lock(&iaa_devices_lock);
+
+	if (iaa_crypto_enabled)
+		goto out;
+
+	ret = kstrtoint(buf, 10, &g_consec_descs_per_gwq);
+	if (ret)
+		goto out;
+
+	ret = count;
+out:
+	mutex_unlock(&iaa_devices_lock);
+
+	return ret;
+}
+static DRIVER_ATTR_RW(g_consec_descs_per_gwq);
+
 /*
  * The iaa crypto driver supports three 'sync' methods determining how
  * compressions and decompressions are performed:
@@ -751,7 +866,20 @@ static void free_wq_table(void)
 	for (cpu = 0; cpu < nr_cpus; cpu++)
 		wq_table_free_entry(cpu);
 
-	free_percpu(wq_table);
+	if (wq_table) {
+		free_percpu(wq_table);
+		wq_table = NULL;
+	}
+
+	if (global_wq_table) {
+		free_percpu(global_wq_table);
+		global_wq_table = NULL;
+	}
+
+	if (num_consec_descs_per_wq) {
+		free_percpu(num_consec_descs_per_wq);
+		num_consec_descs_per_wq = NULL;
+	}
 
 	pr_debug("freed wq table\n");
 }
@@ -774,6 +902,38 @@ static int alloc_wq_table(int max_wqs)
 		}
 
 		entry->max_wqs = max_wqs;
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+	}
+
+	global_wq_table = alloc_percpu(struct wq_table_entry);
+	if (!global_wq_table) {
+		free_wq_table();
+		return -ENOMEM;
+	}
+
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+		entry = per_cpu_ptr(global_wq_table, cpu);
+		entry->wqs = kzalloc(GFP_KERNEL, max_wqs * sizeof(struct wq *));
+		if (!entry->wqs) {
+			free_wq_table();
+			return -ENOMEM;
+		}
+
+		entry->max_wqs = max_wqs;
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+	}
+
+	num_consec_descs_per_wq = alloc_percpu(int);
+	if (!num_consec_descs_per_wq) {
+		free_wq_table();
+		return -ENOMEM;
+	}
+
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+		int *num_consec_descs = per_cpu_ptr(num_consec_descs_per_wq, cpu);
+		*num_consec_descs = 0;
 	}
 
 	pr_debug("initialized wq table\n");
@@ -912,9 +1072,14 @@ static int wq_table_add_wqs(int iaa, int cpu)
 	}
 
 	list_for_each_entry(iaa_wq, &found_device->wqs, list) {
-		wq_table_add(cpu, iaa_wq->wq);
+
+		if (((found_device->n_wq - g_wqs_per_iaa) < 1) ||
+		    (n_wqs_added < (found_device->n_wq - g_wqs_per_iaa))) {
+			wq_table_add(cpu, iaa_wq->wq);
+		}
+
 		pr_debug("rebalance: added wq for cpu=%d: iaa wq %d.%d\n",
-			 cpu, iaa_wq->wq->idxd->id, iaa_wq->wq->id);
+			 cpu, iaa_wq->wq->idxd->id, iaa_wq->wq->id);
 		n_wqs_added++;
 	}
 
@@ -927,6 +1092,63 @@ static int wq_table_add_wqs(int iaa, int cpu)
 	return ret;
 }
 
+static int global_wq_table_add_wqs(void)
+{
+	struct iaa_device *iaa_device;
+	int ret = 0, n_wqs_added;
+	struct idxd_device *idxd;
+	struct iaa_wq *iaa_wq;
+	struct pci_dev *pdev;
+	struct device *dev;
+	int cpu, node, node_of_cpu = -1;
+
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+
+#ifdef CONFIG_NUMA
+		node_of_cpu = -1;
+		for_each_online_node(node) {
+			const struct cpumask *node_cpus;
+			node_cpus = cpumask_of_node(node);
+			if (!cpumask_test_cpu(cpu, node_cpus))
+				continue;
+			node_of_cpu = node;
+			break;
+		}
+#endif
+		list_for_each_entry(iaa_device, &iaa_devices, list) {
+			idxd = iaa_device->idxd;
+			pdev = idxd->pdev;
+			dev = &pdev->dev;
+
+#ifdef CONFIG_NUMA
+			if (dev && (node_of_cpu != dev->numa_node))
+				continue;
+#endif
+
+			if (iaa_device->n_wq <= g_wqs_per_iaa)
+				continue;
+
+			n_wqs_added = 0;
+
+			list_for_each_entry(iaa_wq, &iaa_device->wqs, list) {
+
+				if (n_wqs_added < (iaa_device->n_wq - g_wqs_per_iaa)) {
+					n_wqs_added++;
+				}
+				else {
+					global_wq_table_add(cpu, iaa_wq->wq);
+					pr_debug("rebalance: added global wq for cpu=%d: iaa wq %d.%d\n",
+						 cpu, iaa_wq->wq->idxd->id, iaa_wq->wq->id);
+				}
+			}
+		}
+
+		global_wq_table_set_start_wq(cpu);
+	}
+
+	return ret;
+}
+
 /*
  * Rebalance the wq table so that given a cpu, it's easy to find the
  * closest IAA instance. The idea is to try to choose the most
@@ -961,6 +1183,7 @@ static void rebalance_wq_table(void)
 	}
 
 	pr_debug("Finished rebalance local wqs.");
+	global_wq_table_add_wqs();
 }
 
 static inline int check_completion(struct device *dev,
@@ -1509,6 +1732,27 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
 	goto out;
 }
 
+/*
+ * Caller should make sure to call only if the
+ * per_cpu_ptr "global_wq_table" is non-NULL
+ * and has at least one wq configured.
+ */
+static struct idxd_wq *global_wq_table_next_wq(int cpu)
+{
+	struct wq_table_entry *entry = per_cpu_ptr(global_wq_table, cpu);
+	int *num_consec_descs = per_cpu_ptr(num_consec_descs_per_wq, cpu);
+
+	if ((*num_consec_descs) == g_consec_descs_per_gwq) {
+		if (++entry->cur_wq >= entry->n_wqs)
+			entry->cur_wq = 0;
+		*num_consec_descs = 0;
+	}
+
+	++(*num_consec_descs);
+
+	return entry->wqs[entry->cur_wq];
+}
+
 static int iaa_comp_acompress(struct acomp_req *req)
 {
 	struct iaa_compression_ctx *compression_ctx;
@@ -1521,6 +1765,7 @@ static int iaa_comp_acompress(struct acomp_req *req)
 	struct idxd_wq *wq;
 	struct device *dev;
 	int order = -1;
+	struct wq_table_entry *entry;
 
 	compression_ctx = crypto_tfm_ctx(tfm);
 
@@ -1535,8 +1780,15 @@ static int iaa_comp_acompress(struct acomp_req *req)
 	}
 
 	cpu = get_cpu();
-	wq = wq_table_next_wq(cpu);
+	entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (!entry || entry->n_wqs == 0) {
+		wq = wq_table_next_wq(cpu);
+	} else {
+		wq = global_wq_table_next_wq(cpu);
+	}
 	put_cpu();
+
 	if (!wq) {
 		pr_debug("no wq configured for cpu=%d\n", cpu);
 		return -ENODEV;
@@ -2145,13 +2397,32 @@ static int __init iaa_crypto_init_module(void)
 		goto err_sync_attr_create;
 	}
 
+	ret = driver_create_file(&iaa_crypto_driver.drv,
+				 &driver_attr_g_wqs_per_iaa);
+	if (ret) {
+		pr_debug("IAA g_wqs_per_iaa attr creation failed\n");
+		goto err_g_wqs_per_iaa_attr_create;
+	}
+
+	ret = driver_create_file(&iaa_crypto_driver.drv,
+				 &driver_attr_g_consec_descs_per_gwq);
+	if (ret) {
+		pr_debug("IAA g_consec_descs_per_gwq attr creation failed\n");
+		goto err_g_consec_descs_per_gwq_attr_create;
+	}
+
 	if (iaa_crypto_debugfs_init())
 		pr_warn("debugfs init failed, stats not available\n");
 
 	pr_debug("initialized\n");
 out:
 	return ret;
-
+err_g_consec_descs_per_gwq_attr_create:
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_g_wqs_per_iaa);
+err_g_wqs_per_iaa_attr_create:
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_sync_mode);
 err_sync_attr_create:
 	driver_remove_file(&iaa_crypto_driver.drv,
 			   &driver_attr_verify_compress);
@@ -2175,6 +2446,10 @@ static void __exit iaa_crypto_cleanup_module(void)
 			   &driver_attr_sync_mode);
 	driver_remove_file(&iaa_crypto_driver.drv,
 			   &driver_attr_verify_compress);
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_g_wqs_per_iaa);
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_g_consec_descs_per_gwq);
 	idxd_driver_unregister(&iaa_crypto_driver);
 	iaa_aecs_cleanup_fixed();
 	crypto_free_comp(deflate_generic_tfm);

From patchwork Fri Oct 18 06:40:57 2024
id 7F51C6B0099 for ; Fri, 18 Oct 2024 02:41:13 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 155A31C6F67 for ; Fri, 18 Oct 2024 06:41:00 +0000 (UTC) X-FDA: 82685775690.26.FF38FE7 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by imf27.hostedemail.com (Postfix) with ESMTP id 6686440008 for ; Fri, 18 Oct 2024 06:41:00 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=eridgetG; dmarc=pass (policy=none) header.from=intel.com; spf=pass (imf27.hostedemail.com: domain of kanchana.p.sridhar@intel.com designates 192.198.163.15 as permitted sender) smtp.mailfrom=kanchana.p.sridhar@intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729233552; a=rsa-sha256; cv=none; b=hXDeHAugcq8SPhaJ5YOQdh+DLcxee/RbIZedgOyYbgYxvawhsDaQ6b+UrPjqI33xCUU4oU TFHelwHBzdrlD40KJ3oqtdKbef7Nag7/jraxae08eVcB9+yfz10cEiXR0V7LkX2XW24dar C6TGaQAdDS9kOYEWRvYRk00w6WSFqo0= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=eridgetG; dmarc=pass (policy=none) header.from=intel.com; spf=pass (imf27.hostedemail.com: domain of kanchana.p.sridhar@intel.com designates 192.198.163.15 as permitted sender) smtp.mailfrom=kanchana.p.sridhar@intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729233552; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1cCwxHrWTh0Fjc9IYlgOU3POgtAU1eNjfTY1o6LL8NI=; b=Rov+elM/hmUikohRvXkPN/fYHDoTxgItVDXmrksFb7pQPjAGoPmRPbT/+BtmGYOBN8VALe 4//C7K5HLgv92Mz58ortmfK0mZ+cG/IgvLkrnodzFknxM988GHUrVyhS3SwDa8ALXRqTQS 9C3+oyrGjdqkEbIsvzxU+P4DSr5XYy4= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729233671; x=1760769671; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qqxb4WHT8bprxcMFQO92c1NKetq9casrAYfBRMVL0vQ=; b=eridgetGJ+xcJU/tRREy2BvT+D40sHScrtyK3oQ4G1w/xaO3zlL450H0 qEPUuhS+PW4OhlO9T5IOIhrjTSC6PZ6tyZ9Sbpv6qrDfzCtbpIdpws7bA /bxBic4nIJAt3HzldjQdivB7My3KDuXWYG0Xcx5DXq3URpIA43Xj1FDck ylTUXjn6FxDpwLr/FM7f4qDezwEtoQkQfJwSM2RbE+s/kQ4nj8SLp4kK/ hMLy/lBxgQA4x5AQ0PhL6A219Y3ZuZwqClRdmqb/HEgKlzm87+LPCH2/A JF26O4qUImqW8Mbd7tvRkp/siONs56XfRwcibKBBhtI8Ao8ex6oGCs67f Q==; X-CSE-ConnectionGUID: Zz0Bvj+OQ1SfyN2HnCP6bA== X-CSE-MsgGUID: PW3nBt7CSbuydBlug38cgw== X-IronPort-AV: E=McAfee;i="6700,10204,11228"; a="28884910" X-IronPort-AV: E=Sophos;i="6.11,212,1725346800"; d="scan'208";a="28884910" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Oct 2024 23:41:03 -0700 X-CSE-ConnectionGUID: KzCXCTJiQFKm7BYr/a601A== X-CSE-MsgGUID: e8EgEA8oSM6ZXNXDpKJJHQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="83607525" Received: from jf5300-b11a338t.jf.intel.com ([10.242.51.6]) by orviesa003.jf.intel.com with ESMTP; 17 Oct 2024 23:41:03 -0700 From: Kanchana P Sridhar To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com, ardb@kernel.org, ebiggers@google.com, surenb@google.com, kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org, joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org, linux-fsdevel@vger.kernel.org Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, 
kanchana.p.sridhar@intel.com Subject: [RFC PATCH v1 09/13] mm: zswap: Config variable to enable compress batching in zswap_store(). Date: Thu, 17 Oct 2024 23:40:57 -0700 Message-Id: <20241018064101.336232-10-kanchana.p.sridhar@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com> References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 6686440008 X-Stat-Signature: b5fe4n3nipxmj44g3pnf6jyxp1je45ht X-Rspam-User: X-HE-Tag: 1729233660-59886 X-HE-Meta: U2FsdGVkX19NQ3FgCUug6Zj3n8SFhNCqf7dY1e0K0WOXbdr77OjVmkK5DdkEJruuUjgYPodYLLmLLWXTVdW2dqr9YWkSHwqFJ5oQ4fpPt+/RGLW/P4HtD2khJDYAHGxlp+7gbwD2y1UBljwenWg19ljAyf8UC2ntde2AjeuESbkW/C3JGQHVVnWS0AGxWl6OQQNaTBW4Vfj3W+ubRtO1f1ZsRMjLXQyBQ3mlrP7sqghjnKpKAFdc7hAmEajSHJdISGQOiT5xinToMVWqNcY0S6gAVufOwEH5H1XMNkBt6PBfuczWXbd1g88tlsHsKGyea0aOBINVLd7Qmb7r+FaKLzAT2NsC0thI1V4al86s05U52KbNHrObWRaDyAoOLpGGwmSoiachPZPfBC0XxfrB2n/P5Ozh7ghEhr8IBlLRk4q2DDB/PeuYyl5KCWN9T81ldIPH662M/l1PC0D+dt6TTkw9R10kAsHMGSu5apZLVcp4vmgdRkxi7EbsedDYGcVfcR60n80wl7Rww7x3aflweGGe7ci7AFq/tzeTP9ZR3VjfKyXqOjr0N7VRScxMfvcKVwfytK++uMELdlYrhDy4orJFybF3WBtaYYH7C6x2/LCrFAVWKD8j+swYRla4FDaCTPPaz+Cvmg51C8Wc7/aOYaf8K45cL8nax2wOjNVBQxju8gbMil3kA2dZeTno2vzTykgncqhe63GO5LMM+b8NJbT12Yedevav1DM9zUwrRJO9h8z+mpPmA/kMiWGbcqbtg5VD75B7xms7/UU84+wVRfJuClBWlnHgQzbF1q1fTusETncvxoeb5teW5hG57oxu8p9joZI4qHHylD8Wd5H6N+NA1KBbTsMInbrRss+BclKFixX4De9sTWuA44XO8DZbqgYE1IiNqQoSefFcm6ZPsUEzRiApygbdPdvd/QaTSe+jTgBnL7pGAlgRKvbobYRT1/kUPJaI4IcagNNaFUB Nx6TrGpC 6c8nuBsT8448FEJJecOV1A6CpXfzdJjsT53wWGaFbWjyIbVSUq3G6YRScO0OAmp0ieivNX5s7JzPF7+3reViUhmH9YxLC/UGpWye18g9nC2EBW7MpG85JfKKVeXw9AN5jW0wHAONKfLoZ6R6wMTlLmJZbpIRJ1yW5YsiMahRNM9/lS0idpASGFEjlNFVPZ8SriGURXpyKDm2F7/30pmns/QHOHpffkGGjeONE X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Add 
a new zswap config variable that controls whether zswap_store() will compress a batch of pages, for instance, the pages in a large folio:

  CONFIG_ZSWAP_STORE_BATCHING_ENABLED

The existing CONFIG_CRYPTO_DEV_IAA_CRYPTO variable, added in commit ea7a5cbb4369 ("crypto: iaa - Add Intel IAA Compression Accelerator crypto driver core"), is used to detect whether the system has the Intel In-Memory Analytics Accelerator (IAA) and whether the iaa_crypto module is available. If so, the kernel build will prompt for CONFIG_ZSWAP_STORE_BATCHING_ENABLED. Hence, users can set CONFIG_ZSWAP_STORE_BATCHING_ENABLED="y" only on systems that have Intel IAA.

If CONFIG_ZSWAP_STORE_BATCHING_ENABLED is enabled, and IAA is configured as the zswap compressor, zswap_store() will process the pages in a large folio in batches, i.e., multiple pages at a time. Pages in a batch are compressed in parallel in hardware, then stored. On systems without Intel IAA, and/or if zswap uses a software compressor, pages in the batch are compressed sequentially and stored.

The patch also implements a zswap API that returns the status of this config variable.
Suggested-by: Ying Huang
Signed-off-by: Kanchana P Sridhar
---
 include/linux/zswap.h |  6 ++++++
 mm/Kconfig            | 12 ++++++++++++
 mm/zswap.c            | 14 ++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index d961ead91bf1..74ad2a24b309 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -24,6 +24,7 @@ struct zswap_lruvec_state {
 	atomic_long_t nr_disk_swapins;
 };
 
+bool zswap_store_batching_enabled(void);
 unsigned long zswap_total_pages(void);
 bool zswap_store(struct folio *folio);
 bool zswap_load(struct folio *folio);
@@ -39,6 +40,11 @@ bool zswap_never_enabled(void);
 
 struct zswap_lruvec_state {};
 
+static inline bool zswap_store_batching_enabled(void)
+{
+	return false;
+}
+
 static inline bool zswap_store(struct folio *folio)
 {
 	return false;
diff --git a/mm/Kconfig b/mm/Kconfig
index 33fa51d608dc..26d1a5cee471 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -125,6 +125,18 @@ config ZSWAP_COMPRESSOR_DEFAULT
 	default "zstd" if ZSWAP_COMPRESSOR_DEFAULT_ZSTD
 	default ""
 
+config ZSWAP_STORE_BATCHING_ENABLED
+	bool "Batching of zswap stores with Intel IAA"
+	depends on ZSWAP && CRYPTO_DEV_IAA_CRYPTO
+	default n
+	help
+	  Enables zswap_store to swap out large folios in batches of 8 pages,
+	  rather than a page at a time, if the system has Intel IAA for hardware
+	  acceleration of compressions. If IAA is configured as the zswap
+	  compressor, this will parallelize batch compression of up to 8 pages
+	  in the folio in hardware, thereby improving large folio compression
+	  throughput and reducing swapout latency.
+
 choice
 	prompt "Default allocator"
 	depends on ZSWAP
diff --git a/mm/zswap.c b/mm/zswap.c
index 948c9745ee57..4893302d8c34 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -127,6 +127,15 @@ static bool zswap_shrinker_enabled = IS_ENABLED(
 		CONFIG_ZSWAP_SHRINKER_DEFAULT_ON);
 module_param_named(shrinker_enabled, zswap_shrinker_enabled, bool, 0644);
 
+/*
+ * Enable/disable batching of compressions if zswap_store is called with a
+ * large folio. If enabled, and if IAA is the zswap compressor, pages are
+ * compressed in parallel in batches of say, 8 pages.
+ * If not, every page is compressed sequentially.
+ */
+static bool __zswap_store_batching_enabled = IS_ENABLED(
+		CONFIG_ZSWAP_STORE_BATCHING_ENABLED);
+
 bool zswap_is_enabled(void)
 {
 	return zswap_enabled;
@@ -241,6 +250,11 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 	pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,		\
 		 zpool_get_type((p)->zpool))
 
+__always_inline bool zswap_store_batching_enabled(void)
+{
+	return __zswap_store_batching_enabled;
+}
+
 /*********************************
  * pool functions
 **********************************/

From patchwork Fri Oct 18 06:40:58 2024
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 10/13] mm: zswap: Create multiple reqs/buffers in crypto_acomp_ctx if platform has IAA.
Date: Thu, 17 Oct 2024 23:40:58 -0700
Message-Id: <20241018064101.336232-11-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
Intel IAA hardware acceleration can be used effectively to improve the zswap_store() performance of large folios by batching multiple pages in a folio to be compressed in parallel by IAA. Hence, to build compress batching of zswap large folio stores using IAA, we need to be able to submit a batch of compress jobs from zswap to the hardware, to be compressed in parallel, when the iaa_crypto "async" mode is used.

The IAA compress batching paradigm works as follows:

1) Submit N crypto_acomp_compress() jobs using N requests.
2) Use the iaa_crypto driver's async poll() method to check for the jobs to complete.
3) There are no ordering constraints implied by submission; hence we can loop through the requests and process any job that has completed.
4) This repeats until all jobs have completed with a success/error status.

To facilitate this, we need to provide for multiple acomp_reqs in "struct crypto_acomp_ctx", each representing a distinct compress job. Likewise, there needs to be a distinct destination buffer corresponding to each acomp_req.

If CONFIG_ZSWAP_STORE_BATCHING_ENABLED is enabled, this patch sets the SWAP_CRYPTO_SUB_BATCH_SIZE constant to 8UL. This implies each per-cpu crypto_acomp_ctx associated with the zswap_pool can submit up to 8 acomp_reqs at a time to accomplish parallel compressions. If IAA is not present and/or CONFIG_ZSWAP_STORE_BATCHING_ENABLED is not set, SWAP_CRYPTO_SUB_BATCH_SIZE is set to 1UL.

On an Intel Sapphire Rapids server, each socket has 4 IAA devices, each of which has 2 compress engines and 8 decompress engines.
Experiments modeling a contended system with, say, 72 processes running under a cgroup with a fixed memory limit have shown a significant performance improvement from dispatching compress jobs from all cores to all the IAA devices on the socket. Hence, SWAP_CRYPTO_SUB_BATCH_SIZE is set to 8 to maximize compression throughput if IAA is available.

The definition of "struct crypto_acomp_ctx" is modified to make its req/buffer members arrays of size SWAP_CRYPTO_SUB_BATCH_SIZE. Thus, the added memory footprint of this per-cpu structure for batching is incurred only on platforms that have Intel IAA.

Suggested-by: Ying Huang
Signed-off-by: Kanchana P Sridhar
---
 mm/swap.h  |  11 ++++++
 mm/zswap.c | 104 ++++++++++++++++++++++++++++++++++-------------------
 2 files changed, 78 insertions(+), 37 deletions(-)

diff --git a/mm/swap.h b/mm/swap.h
index ad2f121de970..566616c971d4 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -8,6 +8,17 @@ struct mempolicy;
 #include  /* for swp_offset */
 #include  /* for bio_end_io_t */
 
+/*
+ * For IAA compression batching:
+ * Maximum number of IAA acomp compress requests that will be processed
+ * in a sub-batch.
+ */
+#if defined(CONFIG_ZSWAP_STORE_BATCHING_ENABLED)
+#define SWAP_CRYPTO_SUB_BATCH_SIZE 8UL
+#else
+#define SWAP_CRYPTO_SUB_BATCH_SIZE 1UL
+#endif
+
 /* linux/mm/page_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
diff --git a/mm/zswap.c b/mm/zswap.c
index 4893302d8c34..579869d1bdf6 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -152,9 +152,9 @@ bool zswap_never_enabled(void)
 
 struct crypto_acomp_ctx {
 	struct crypto_acomp *acomp;
-	struct acomp_req *req;
+	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
 	struct crypto_wait wait;
-	u8 *buffer;
 	struct mutex mutex;
 	bool is_sleepable;
 };
@@ -832,49 +832,64 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
 	struct crypto_acomp *acomp;
-	struct acomp_req *req;
 	int ret;
+	int i, j;
 
 	mutex_init(&acomp_ctx->mutex);
 
-	acomp_ctx->buffer = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));
-	if (!acomp_ctx->buffer)
-		return -ENOMEM;
-
 	acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
 	if (IS_ERR(acomp)) {
 		pr_err("could not alloc crypto acomp %s : %ld\n",
 				pool->tfm_name, PTR_ERR(acomp));
-		ret = PTR_ERR(acomp);
-		goto acomp_fail;
+		return PTR_ERR(acomp);
 	}
 	acomp_ctx->acomp = acomp;
 	acomp_ctx->is_sleepable = acomp_is_async(acomp);
 
-	req = acomp_request_alloc(acomp_ctx->acomp);
-	if (!req) {
-		pr_err("could not alloc crypto acomp_request %s\n",
-		       pool->tfm_name);
-		ret = -ENOMEM;
-		goto req_fail;
+	for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i) {
+		acomp_ctx->buffer[i] = kmalloc_node(PAGE_SIZE * 2,
+						    GFP_KERNEL, cpu_to_node(cpu));
+		if (!acomp_ctx->buffer[i]) {
+			for (j = 0; j < i; ++j)
+				kfree(acomp_ctx->buffer[j]);
+			ret = -ENOMEM;
+			goto buf_fail;
+		}
+	}
+
+	for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i) {
+		acomp_ctx->req[i] = acomp_request_alloc(acomp_ctx->acomp);
+		if (!acomp_ctx->req[i]) {
+			pr_err("could not alloc crypto acomp_request req[%d] %s\n",
+			       i, pool->tfm_name);
+			for (j = 0; j < i; ++j)
+				acomp_request_free(acomp_ctx->req[j]);
+			ret = -ENOMEM;
+			goto req_fail;
+		}
 	}
-	acomp_ctx->req = req;
 
+	/*
+	 * The crypto_wait is used only in fully synchronous, i.e., with scomp
+	 * or non-poll mode of acomp, hence there is only one "wait" per
+	 * acomp_ctx, with callback set to req[0].
+	 */
 	crypto_init_wait(&acomp_ctx->wait);
 	/*
 	 * if the backend of acomp is async zip, crypto_req_done() will wakeup
 	 * crypto_wait_req(); if the backend of acomp is scomp, the callback
 	 * won't be called, crypto_wait_req() will return without blocking.
 	 */
-	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+	acomp_request_set_callback(acomp_ctx->req[0], CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   crypto_req_done, &acomp_ctx->wait);
 
 	return 0;
 
 req_fail:
+	for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i)
+		kfree(acomp_ctx->buffer[i]);
+buf_fail:
 	crypto_free_acomp(acomp_ctx->acomp);
-acomp_fail:
-	kfree(acomp_ctx->buffer);
 	return ret;
 }
 
@@ -884,11 +899,17 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
 
 	if (!IS_ERR_OR_NULL(acomp_ctx)) {
-		if (!IS_ERR_OR_NULL(acomp_ctx->req))
-			acomp_request_free(acomp_ctx->req);
+		int i;
+
+		for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i)
+			if (!IS_ERR_OR_NULL(acomp_ctx->req[i]))
+				acomp_request_free(acomp_ctx->req[i]);
+
+		for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i)
+			kfree(acomp_ctx->buffer[i]);
+
 		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
 			crypto_free_acomp(acomp_ctx->acomp);
-		kfree(acomp_ctx->buffer);
 	}
 
 	return 0;
@@ -911,7 +932,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 
 	mutex_lock(&acomp_ctx->mutex);
 
-	dst = acomp_ctx->buffer;
+	dst = acomp_ctx->buffer[0];
 	sg_init_table(&input, 1);
 	sg_set_page(&input, page, PAGE_SIZE, 0);
 
@@ -921,7 +942,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	 * giving the dst buffer with enough length to avoid buffer overflow.
 	 */
 	sg_init_one(&output, dst, PAGE_SIZE * 2);
-	acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
+	acomp_request_set_params(acomp_ctx->req[0], &input, &output, PAGE_SIZE, dlen);
 
 	/*
 	 * If the crypto_acomp provides an asynchronous poll() interface,
@@ -940,19 +961,20 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	 * parallel.
 	 */
 	if (acomp_ctx->acomp->poll) {
-		comp_ret = crypto_acomp_compress(acomp_ctx->req);
+		comp_ret = crypto_acomp_compress(acomp_ctx->req[0]);
 		if (comp_ret == -EINPROGRESS) {
 			do {
-				comp_ret = crypto_acomp_poll(acomp_ctx->req);
+				comp_ret = crypto_acomp_poll(acomp_ctx->req[0]);
 				if (comp_ret && comp_ret != -EAGAIN)
 					break;
 			} while (comp_ret);
 		}
 	} else {
-		comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+		comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req[0]),
+					   &acomp_ctx->wait);
 	}
 
-	dlen = acomp_ctx->req->dlen;
+	dlen = acomp_ctx->req[0]->dlen;
 	if (comp_ret)
 		goto unlock;
 
@@ -1006,31 +1028,39 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	 */
 	if ((acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) ||
	    !virt_addr_valid(src)) {
-		memcpy(acomp_ctx->buffer, src, entry->length);
-		src = acomp_ctx->buffer;
+		memcpy(acomp_ctx->buffer[0], src, entry->length);
+		src = acomp_ctx->buffer[0];
 		zpool_unmap_handle(zpool, entry->handle);
 	}
 
 	sg_init_one(&input, src, entry->length);
 	sg_init_table(&output, 1);
 	sg_set_folio(&output, folio, PAGE_SIZE, 0);
-	acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
+	acomp_request_set_params(acomp_ctx->req[0], &input, &output,
+				 entry->length, PAGE_SIZE);
 
 	if (acomp_ctx->acomp->poll) {
-		ret = crypto_acomp_decompress(acomp_ctx->req);
+		ret = crypto_acomp_decompress(acomp_ctx->req[0]);
 		if (ret == -EINPROGRESS) {
 			do {
-				ret = crypto_acomp_poll(acomp_ctx->req);
+				ret = crypto_acomp_poll(acomp_ctx->req[0]);
 				BUG_ON(ret && ret != -EAGAIN);
 			} while (ret);
 		}
 	} else {
-		BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
+		BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req[0]),
+				       &acomp_ctx->wait));
 	}
-	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
-	mutex_unlock(&acomp_ctx->mutex);
+	BUG_ON(acomp_ctx->req[0]->dlen != PAGE_SIZE);
 
-	if (src != acomp_ctx->buffer)
+	if (src != acomp_ctx->buffer[0])
 		zpool_unmap_handle(zpool, entry->handle);
+
+	/*
+	 * It is safer to unlock the mutex after the check for
+	 * "src != acomp_ctx->buffer[0]" so that the value of "src"
+	 * does not change.
+	 */
+	mutex_unlock(&acomp_ctx->mutex);
 }
 
 /*********************************

From patchwork Fri Oct 18 06:40:59 2024
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 11/13] mm: swap: Add IAA batch compression API swap_crypto_acomp_compress_batch().
Date: Thu, 17 Oct 2024 23:40:59 -0700
Message-Id: <20241018064101.336232-12-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

Added a new API swap_crypto_acomp_compress_batch() that does batch compression.
A system that has Intel IAA can use this API to submit a batch of compress jobs for parallel compression in hardware, to improve performance. On a system without IAA, this API will process each compress job sequentially.

The purpose of this API is to be invocable from any swap module that needs to compress large folios, or a batch of pages in the general case. For instance, zswap would batch-compress up to SWAP_CRYPTO_SUB_BATCH_SIZE (i.e., 8 if the system has IAA) pages in the large folio in parallel to improve zswap_store() performance. Towards this eventual goal:

1) The definition of "struct crypto_acomp_ctx" is moved to mm/swap.h so that mm modules like swap_state.c and zswap.c can reference it.

2) The swap_crypto_acomp_compress_batch() interface is implemented in swap_state.c.

It would be preferable for "struct crypto_acomp_ctx" to be defined in, and for swap_crypto_acomp_compress_batch() to be exported via, include/linux/swap.h so that modules outside mm (e.g., zram) can potentially use the API for batch compressions with IAA. I would appreciate RFC comments on this.
Signed-off-by: Kanchana P Sridhar
---
 mm/swap.h       |  45 +++++++++++++++++++
 mm/swap_state.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
 mm/zswap.c      |   9 ----
 3 files changed, 160 insertions(+), 9 deletions(-)

diff --git a/mm/swap.h b/mm/swap.h
index 566616c971d4..4dcb67e2cc33 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -7,6 +7,7 @@ struct mempolicy;
 #ifdef CONFIG_SWAP
 #include  /* for swp_offset */
 #include  /* for bio_end_io_t */
+#include 
 
 /*
  * For IAA compression batching:
@@ -19,6 +20,39 @@ struct mempolicy;
 #define SWAP_CRYPTO_SUB_BATCH_SIZE 1UL
 #endif
 
+/* linux/mm/swap_state.c, zswap.c */
+struct crypto_acomp_ctx {
+	struct crypto_acomp *acomp;
+	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct crypto_wait wait;
+	struct mutex mutex;
+	bool is_sleepable;
+};
+
+/**
+ * This API provides IAA compress batching functionality for use by swap
+ * modules.
+ * The acomp_ctx mutex should be locked/unlocked before/after calling this
+ * procedure.
+ *
+ * @pages: Pages to be compressed.
+ * @dsts: Pre-allocated destination buffers to store results of IAA compression.
+ * @dlens: Will contain the compressed lengths.
+ * @errors: Will contain a 0 if the page was successfully compressed, or a
+ *          non-0 error value to be processed by the calling function.
+ * @nr_pages: The number of pages, up to SWAP_CRYPTO_SUB_BATCH_SIZE,
+ *            to be compressed.
+ * @acomp_ctx: The acomp context for iaa_crypto/other compressor.
+ */
+void swap_crypto_acomp_compress_batch(
+	struct page *pages[],
+	u8 *dsts[],
+	unsigned int dlens[],
+	int errors[],
+	int nr_pages,
+	struct crypto_acomp_ctx *acomp_ctx);
+
 /* linux/mm/page_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
@@ -119,6 +153,17 @@ static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
 #else /* CONFIG_SWAP */
 struct swap_iocb;
 
+struct crypto_acomp_ctx {};
+static inline void swap_crypto_acomp_compress_batch(
+	struct page *pages[],
+	u8 *dsts[],
+	unsigned int dlens[],
+	int errors[],
+	int nr_pages,
+	struct crypto_acomp_ctx *acomp_ctx)
+{
+}
+
 static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 {
 }
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 4669f29cf555..117c3caa5679 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "internal.h"
 #include "swap.h"
@@ -742,6 +744,119 @@ void exit_swap_address_space(unsigned int type)
 	swapper_spaces[type] = NULL;
 }
 
+#ifdef CONFIG_SWAP
+
+/**
+ * This API provides IAA compress batching functionality for use by swap
+ * modules.
+ * The acomp_ctx mutex should be locked/unlocked before/after calling this
+ * procedure.
+ *
+ * @pages: Pages to be compressed.
+ * @dsts: Pre-allocated destination buffers to store results of IAA compression.
+ * @dlens: Will contain the compressed lengths.
+ * @errors: Will contain a 0 if the page was successfully compressed, or a
+ *          non-0 error value to be processed by the calling function.
+ * @nr_pages: The number of pages, up to SWAP_CRYPTO_SUB_BATCH_SIZE,
+ *            to be compressed.
+ * @acomp_ctx: The acomp context for iaa_crypto/other compressor.
+ */
+void swap_crypto_acomp_compress_batch(
+	struct page *pages[],
+	u8 *dsts[],
+	unsigned int dlens[],
+	int errors[],
+	int nr_pages,
+	struct crypto_acomp_ctx *acomp_ctx)
+{
+	struct scatterlist inputs[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct scatterlist outputs[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	bool compressions_done = false;
+	int i, j;
+
+	BUG_ON(nr_pages > SWAP_CRYPTO_SUB_BATCH_SIZE);
+
+	/*
+	 * Prepare and submit acomp_reqs to IAA.
+	 * IAA will process these compress jobs in parallel in async mode.
+	 * If the compressor does not support a poll() method, or if IAA is
+	 * used in sync mode, the jobs will be processed sequentially using
+	 * acomp_ctx->req[0] and acomp_ctx->wait.
+	 */
+	for (i = 0; i < nr_pages; ++i) {
+		j = acomp_ctx->acomp->poll ? i : 0;
+		sg_init_table(&inputs[i], 1);
+		sg_set_page(&inputs[i], pages[i], PAGE_SIZE, 0);
+
+		/*
+		 * Each acomp_ctx->buffer[] is of size (PAGE_SIZE * 2).
+		 * Reflect same in sg_list.
+		 */
+		sg_init_one(&outputs[i], dsts[i], PAGE_SIZE * 2);
+		acomp_request_set_params(acomp_ctx->req[j], &inputs[i],
+					 &outputs[i], PAGE_SIZE, dlens[i]);
+
+		/*
+		 * If the crypto_acomp provides an asynchronous poll()
+		 * interface, submit the request to the driver now, and poll for
+		 * a completion status later, after all descriptors have been
+		 * submitted. If the crypto_acomp does not provide a poll()
+		 * interface, submit the request and wait for it to complete,
+		 * i.e., synchronously, before moving on to the next request.
+		 */
+		if (acomp_ctx->acomp->poll) {
+			errors[i] = crypto_acomp_compress(acomp_ctx->req[j]);
+
+			if (errors[i] != -EINPROGRESS)
+				errors[i] = -EINVAL;
+			else
+				errors[i] = -EAGAIN;
+		} else {
+			errors[i] = crypto_wait_req(
+				crypto_acomp_compress(acomp_ctx->req[j]),
+				&acomp_ctx->wait);
+			if (!errors[i])
+				dlens[i] = acomp_ctx->req[j]->dlen;
+		}
+	}
+
+	/*
+	 * If not doing async compressions, the batch has been processed at
+	 * this point and we can return.
+	 */
+	if (!acomp_ctx->acomp->poll)
+		return;
+
+	/*
+	 * Poll for and process IAA compress job completions
+	 * in out-of-order manner.
+	 */
+	while (!compressions_done) {
+		compressions_done = true;
+
+		for (i = 0; i < nr_pages; ++i) {
+			/*
+			 * Skip, if the compression has already completed
+			 * successfully or with an error.
+			 */
+			if (errors[i] != -EAGAIN)
+				continue;
+
+			errors[i] = crypto_acomp_poll(acomp_ctx->req[i]);
+
+			if (errors[i]) {
+				if (errors[i] == -EAGAIN)
+					compressions_done = false;
+			} else {
+				dlens[i] = acomp_ctx->req[i]->dlen;
+			}
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(swap_crypto_acomp_compress_batch);
+
+#endif /* CONFIG_SWAP */
+
 static int swap_vma_ra_win(struct vm_fault *vmf, unsigned long *start,
 			   unsigned long *end)
 {
diff --git a/mm/zswap.c b/mm/zswap.c
index 579869d1bdf6..cab3114321f9 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -150,15 +150,6 @@ bool zswap_never_enabled(void)
  * data structures
  **********************************/
 
-struct crypto_acomp_ctx {
-	struct crypto_acomp *acomp;
-	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
-	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
-	struct crypto_wait wait;
-	struct mutex mutex;
-	bool is_sleepable;
-};
-
 /*
  * The lock ordering is zswap_tree.lock -> zswap_pool.lru_lock.
 * The only case where lru_lock is not acquired while holding tree.lock is

From patchwork Fri Oct 18 06:41:00 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P"
X-Patchwork-Id: 13841239
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
 usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
 ardb@kernel.org, ebiggers@google.com, surenb@google.com,
 kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
 brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
 joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com,
 kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 12/13] mm: zswap: Compress batching with Intel IAA in
 zswap_store() of large folios.
Date: Thu, 17 Oct 2024 23:41:00 -0700
Message-Id: <20241018064101.336232-13-kanchana.p.sridhar@intel.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

If the system has Intel IAA, and if CONFIG_ZSWAP_STORE_BATCHING_ENABLED
is set to "y", zswap_store() will call
swap_crypto_acomp_compress_batch() to batch compress up to
SWAP_CRYPTO_SUB_BATCH_SIZE pages in large folios in parallel, using the
multiple compress engines available in IAA hardware. On platforms with
multiple IAA devices per socket, compress jobs from all cores in a socket
will be distributed among all IAA devices on the socket by the iaa_crypto
driver.

If zswap_store() is called with a large folio, and if
zswap_store_batching_enabled() returns "true", it will call the main
__zswap_store_batch_core() interface for compress batching. The interface
represents the extensible compress batching architecture that can
potentially be called with a batch of any-order folios from
shrink_folio_list(). In other words, although zswap_store() calls
__zswap_store_batch_core() with exactly one large folio in this patch, we
will reuse this API to reclaim a batch of folios in subsequent patches.

The newly added functions that implement batched stores follow the general
structure of zswap_store() of a large folio. Some amount of restructuring
and optimization is done to minimize failure points for a batch, fail
early, and maximize zswap store pipeline occupancy with
SWAP_CRYPTO_SUB_BATCH_SIZE pages, potentially from multiple folios. This
is intended to maximize reclaim throughput with the IAA hardware's
parallel compressions.

Signed-off-by: Kanchana P Sridhar
---
 include/linux/zswap.h |  84 ++++++
 mm/zswap.c            | 591 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 671 insertions(+), 4 deletions(-)

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 74ad2a24b309..9bbe330686f6 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -24,6 +24,88 @@ struct zswap_lruvec_state {
 	atomic_long_t nr_disk_swapins;
 };
 
+/*
+ * struct zswap_store_sub_batch_page:
+ *
+ * This represents one "zswap batching element", namely, the
+ * attributes associated with a page in a large folio that will
+ * be compressed and stored in zswap.
+ * The term "batch" is reserved
+ * for a conceptual "batch" of folios that can be sent to
+ * zswap_store() by reclaim. The term "sub-batch" is used to describe
+ * a collection of "zswap batching elements", i.e., an array of
+ * "struct zswap_store_sub_batch_page *".
+ *
+ * The zswap compress sub-batch size is specified by
+ * SWAP_CRYPTO_SUB_BATCH_SIZE, currently set as 8UL if the
+ * platform has Intel IAA. This means zswap can store a large folio
+ * by creating sub-batches of up to 8 pages and compressing this
+ * batch using IAA to parallelize the 8 compress jobs in hardware.
+ * For example, a 64KB folio can be compressed as 2 sub-batches of
+ * 8 pages each. This can significantly improve the zswap_store()
+ * performance for large folios.
+ *
+ * Although the page itself is represented directly, the structure
+ * adds a "u8 batch_idx" to represent an index for the folio in a
+ * conceptual "batch of folios" that can be passed to zswap_store().
+ * Conceptually, this allows for up to 256 folios that can be passed
+ * to zswap_store(). If this conceptual number of folios sent to
+ * zswap_store() exceeds 256, the "batch_idx" needs to become u16.
+ */
+struct zswap_store_sub_batch_page {
+	u8 batch_idx;
+	swp_entry_t swpentry;
+	struct obj_cgroup *objcg;
+	struct zswap_entry *entry;
+	int error; /* folio error status. */
+};
+
+/*
+ * struct zswap_store_pipeline_state:
+ *
+ * This stores state during IAA compress batching of (conceptually, a batch
+ * of) folios. The term "pipelining" in this context refers to breaking down
+ * the batch of folios being reclaimed into sub-batches of
+ * SWAP_CRYPTO_SUB_BATCH_SIZE pages, batch compressing and storing the
+ * sub-batch. This concept could be further evolved to use overlap of CPU
+ * computes with IAA computes. For instance, we could stage the post-compress
+ * computes for sub-batch "N-1" to happen in parallel with IAA batch
+ * compression of sub-batch "N".
+ * + * We begin by developing the concept of compress batching. Pipelining with + * overlap can be future work. + * + * @errors: The errors status for the batch of reclaim folios passed in from + * a higher mm layer such as swap_writepage(). + * @pool: A valid zswap_pool. + * @acomp_ctx: The per-cpu pointer to the crypto_acomp_ctx for the @pool. + * @sub_batch: This is an array that represents the sub-batch of up to + * SWAP_CRYPTO_SUB_BATCH_SIZE pages that are being stored + * in zswap. + * @comp_dsts: The destination buffers for crypto_acomp_compress() for each + * page being compressed. + * @comp_dlens: The destination buffers' lengths from crypto_acomp_compress() + * obtained after crypto_acomp_poll() returns completion status, + * for each page being compressed. + * @comp_errors: Compression errors for each page being compressed. + * @nr_comp_pages: Total number of pages in @sub_batch. + * + * Note: + * The max sub-batch size is SWAP_CRYPTO_SUB_BATCH_SIZE, currently 8UL. + * Hence, if SWAP_CRYPTO_SUB_BATCH_SIZE exceeds 256, some of the + * u8 members (except @comp_dsts) need to become u16. 
+ */ +struct zswap_store_pipeline_state { + int *errors; + struct zswap_pool *pool; + struct crypto_acomp_ctx *acomp_ctx; + struct zswap_store_sub_batch_page *sub_batch; + struct page **comp_pages; + u8 **comp_dsts; + unsigned int *comp_dlens; + int *comp_errors; + u8 nr_comp_pages; +}; + bool zswap_store_batching_enabled(void); unsigned long zswap_total_pages(void); bool zswap_store(struct folio *folio); @@ -39,6 +121,8 @@ bool zswap_never_enabled(void); #else struct zswap_lruvec_state {}; +struct zswap_store_sub_batch_page {}; +struct zswap_store_pipeline_state {}; static inline bool zswap_store_batching_enabled(void) { diff --git a/mm/zswap.c b/mm/zswap.c index cab3114321f9..1c12a7b9f4ff 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -130,7 +130,7 @@ module_param_named(shrinker_enabled, zswap_shrinker_enabled, bool, 0644); /* * Enable/disable batching of compressions if zswap_store is called with a * large folio. If enabled, and if IAA is the zswap compressor, pages are - * compressed in parallel in batches of say, 8 pages. + * compressed in parallel in batches of SWAP_CRYPTO_SUB_BATCH_SIZE pages. * If not, every page is compressed sequentially. */ static bool __zswap_store_batching_enabled = IS_ENABLED( @@ -246,6 +246,12 @@ __always_inline bool zswap_store_batching_enabled(void) return __zswap_store_batching_enabled; } +static void __zswap_store_batch_core( + int node_id, + struct folio **folios, + int *errors, + unsigned int nr_folios); + /********************************* * pool functions **********************************/ @@ -906,6 +912,9 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node) return 0; } +/* + * The acomp_ctx->mutex must be locked/unlocked in the calling procedure. 
+ */ static bool zswap_compress(struct page *page, struct zswap_entry *entry, struct zswap_pool *pool) { @@ -921,8 +930,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); - mutex_lock(&acomp_ctx->mutex); - dst = acomp_ctx->buffer[0]; sg_init_table(&input, 1); sg_set_page(&input, page, PAGE_SIZE, 0); @@ -992,7 +999,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, else if (alloc_ret) zswap_reject_alloc_fail++; - mutex_unlock(&acomp_ctx->mutex); return comp_ret == 0 && alloc_ret == 0; } @@ -1545,10 +1551,17 @@ static ssize_t zswap_store_page(struct page *page, return -EINVAL; } +/* + * Modified to use the IAA compress batching framework implemented in + * __zswap_store_batch_core() if zswap_store_batching_enabled() is true. + * The batching code is intended to significantly improve folio store + * performance over the sequential code. + */ bool zswap_store(struct folio *folio) { long nr_pages = folio_nr_pages(folio); swp_entry_t swp = folio->swap; + struct crypto_acomp_ctx *acomp_ctx; struct obj_cgroup *objcg = NULL; struct mem_cgroup *memcg = NULL; struct zswap_pool *pool; @@ -1556,6 +1569,17 @@ bool zswap_store(struct folio *folio) bool ret = false; long index; + /* + * Improve large folio zswap_store() latency with IAA compress batching. 
+ */ + if (folio_test_large(folio) && zswap_store_batching_enabled()) { + int error = -1; + __zswap_store_batch_core(folio_nid(folio), &folio, &error, 1); + if (!error) + ret = true; + return ret; + } + VM_WARN_ON_ONCE(!folio_test_locked(folio)); VM_WARN_ON_ONCE(!folio_test_swapcache(folio)); @@ -1588,6 +1612,9 @@ bool zswap_store(struct folio *folio) mem_cgroup_put(memcg); } + acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); + mutex_lock(&acomp_ctx->mutex); + for (index = 0; index < nr_pages; ++index) { struct page *page = folio_page(folio, index); ssize_t bytes; @@ -1609,6 +1636,7 @@ bool zswap_store(struct folio *folio) ret = true; put_pool: + mutex_unlock(&acomp_ctx->mutex); zswap_pool_put(pool); put_objcg: obj_cgroup_put(objcg); @@ -1638,6 +1666,561 @@ bool zswap_store(struct folio *folio) return ret; } +/* + * Note: If SWAP_CRYPTO_SUB_BATCH_SIZE exceeds 256, change the + * u8 stack variables in the next several functions, to u16. + */ + +/* + * Propagate the "sbp" error condition to other batch elements belonging to + * the same folio as "sbp". + */ +static __always_inline void zswap_store_propagate_errors( + struct zswap_store_pipeline_state *zst, + u8 error_batch_idx) +{ + u8 i; + + if (zst->errors[error_batch_idx]) + return; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + + if (sbp->batch_idx == error_batch_idx) { + if (!sbp->error) { + if (!IS_ERR_VALUE(sbp->entry->handle)) + zpool_free(zst->pool->zpool, sbp->entry->handle); + + if (sbp->entry) { + zswap_entry_cache_free(sbp->entry); + sbp->entry = NULL; + } + sbp->error = -EINVAL; + } + } + } + + /* + * Set zswap status for the folio to "error" + * for use in swap_writepage. 
+ */ + zst->errors[error_batch_idx] = -EINVAL; +} + +static __always_inline void zswap_process_comp_errors( + struct zswap_store_pipeline_state *zst) +{ + u8 i; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + + if (zst->comp_errors[i]) { + if (zst->comp_errors[i] == -ENOSPC) + zswap_reject_compress_poor++; + else + zswap_reject_compress_fail++; + + if (!sbp->error) + zswap_store_propagate_errors(zst, + sbp->batch_idx); + } + } +} + +static void zswap_compress_batch(struct zswap_store_pipeline_state *zst) +{ + /* + * Compress up to SWAP_CRYPTO_SUB_BATCH_SIZE pages. + * If IAA is the zswap compressor, this compresses the + * pages in parallel, leading to significant performance + * improvements as compared to software compressors. + */ + swap_crypto_acomp_compress_batch( + zst->comp_pages, + zst->comp_dsts, + zst->comp_dlens, + zst->comp_errors, + zst->nr_comp_pages, + zst->acomp_ctx); + + /* + * Scan the sub-batch for any compression errors, + * and invalidate pages with errors, along with other + * pages belonging to the same folio as the error pages. + */ + zswap_process_comp_errors(zst); +} + +static void zswap_zpool_store_sub_batch( + struct zswap_store_pipeline_state *zst) +{ + u8 i; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + struct zpool *zpool; + unsigned long handle; + char *buf; + gfp_t gfp; + int err; + + /* Skip pages that had compress errors. 
*/ + if (sbp->error) + continue; + + zpool = zst->pool->zpool; + gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM; + if (zpool_malloc_support_movable(zpool)) + gfp |= __GFP_HIGHMEM | __GFP_MOVABLE; + err = zpool_malloc(zpool, zst->comp_dlens[i], gfp, &handle); + + if (err) { + if (err == -ENOSPC) + zswap_reject_compress_poor++; + else + zswap_reject_alloc_fail++; + + /* + * An error should be propagated to other pages of the + * same folio in the sub-batch, and zpool resources for + * those pages (in sub-batch order prior to this zpool + * error) should be de-allocated. + */ + zswap_store_propagate_errors(zst, sbp->batch_idx); + continue; + } + + buf = zpool_map_handle(zpool, handle, ZPOOL_MM_WO); + memcpy(buf, zst->comp_dsts[i], zst->comp_dlens[i]); + zpool_unmap_handle(zpool, handle); + + sbp->entry->handle = handle; + sbp->entry->length = zst->comp_dlens[i]; + } +} + +/* + * Returns true if the entry was successfully + * stored in the xarray, and false otherwise. + */ +static bool zswap_store_entry(swp_entry_t page_swpentry, + struct zswap_entry *entry) +{ + struct zswap_entry *old = xa_store(swap_zswap_tree(page_swpentry), + swp_offset(page_swpentry), + entry, GFP_KERNEL); + if (xa_is_err(old)) { + int err = xa_err(old); + + WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err); + zswap_reject_alloc_fail++; + return false; + } + + /* + * We may have had an existing entry that became stale when + * the folio was redirtied and now the new version is being + * swapped out. Get rid of the old. 
+ */ + if (old) + zswap_entry_free(old); + + return true; +} + +static void zswap_batch_compress_post_proc( + struct zswap_store_pipeline_state *zst) +{ + int nr_objcg_pages = 0, nr_pages = 0; + struct obj_cgroup *objcg = NULL; + size_t compressed_bytes = 0; + u8 i; + + zswap_zpool_store_sub_batch(zst); + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + + if (sbp->error) + continue; + + if (!zswap_store_entry(sbp->swpentry, sbp->entry)) { + zswap_store_propagate_errors(zst, sbp->batch_idx); + continue; + } + + /* + * The entry is successfully compressed and stored in the tree, + * there is no further possibility of failure. Grab refs to the + * pool and objcg. These refs will be dropped by + * zswap_entry_free() when the entry is removed from the tree. + */ + zswap_pool_get(zst->pool); + if (sbp->objcg) + obj_cgroup_get(sbp->objcg); + + /* + * We finish initializing the entry while it's already in xarray. + * This is safe because: + * + * 1. Concurrent stores and invalidations are excluded by folio + * lock. + * + * 2. Writeback is excluded by the entry not being on the LRU yet. + * The publishing order matters to prevent writeback from seeing + * an incoherent entry. + */ + sbp->entry->pool = zst->pool; + sbp->entry->swpentry = sbp->swpentry; + sbp->entry->objcg = sbp->objcg; + sbp->entry->referenced = true; + if (sbp->entry->length) { + INIT_LIST_HEAD(&sbp->entry->lru); + zswap_lru_add(&zswap_list_lru, sbp->entry); + } + + if (!objcg && sbp->objcg) { + objcg = sbp->objcg; + } else if (objcg && sbp->objcg && (objcg != sbp->objcg)) { + obj_cgroup_charge_zswap(objcg, compressed_bytes); + count_objcg_events(objcg, ZSWPOUT, nr_objcg_pages); + compressed_bytes = 0; + nr_objcg_pages = 0; + objcg = sbp->objcg; + } + + if (sbp->objcg) { + compressed_bytes += sbp->entry->length; + ++nr_objcg_pages; + } + + ++nr_pages; + } /* for sub-batch pages. 
*/ + + if (objcg) { + obj_cgroup_charge_zswap(objcg, compressed_bytes); + count_objcg_events(objcg, ZSWPOUT, nr_objcg_pages); + } + + atomic_long_add(nr_pages, &zswap_stored_pages); + count_vm_events(ZSWPOUT, nr_pages); +} + +static void zswap_store_sub_batch(struct zswap_store_pipeline_state *zst) +{ + u8 i; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + zst->comp_dsts[i] = zst->acomp_ctx->buffer[i]; + zst->comp_dlens[i] = PAGE_SIZE; + } /* for sub-batch pages. */ + + /* + * Batch compress sub-batch "N". If IAA is the compressor, the + * hardware will compress multiple pages in parallel. + */ + zswap_compress_batch(zst); + + zswap_batch_compress_post_proc(zst); +} + +static void zswap_add_folio_pages_to_sb( + struct zswap_store_pipeline_state *zst, + struct folio* folio, + u8 batch_idx, + struct obj_cgroup *objcg, + struct zswap_entry *entries[], + long start_idx, + u8 add_nr_pages) +{ + long index; + + for (index = start_idx; index < (start_idx + add_nr_pages); ++index) { + u8 i = zst->nr_comp_pages; + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + struct page *page = folio_page(folio, index); + zst->comp_pages[i] = page; + sbp->swpentry = page_swap_entry(page); + sbp->batch_idx = batch_idx; + sbp->objcg = objcg; + sbp->entry = entries[index - start_idx]; + sbp->error = 0; + ++zst->nr_comp_pages; + } +} + +static __always_inline void zswap_store_reset_sub_batch( + struct zswap_store_pipeline_state *zst) +{ + zst->nr_comp_pages = 0; +} + +/* Allocate entries for the next sub-batch. 
 */
+static int zswap_alloc_entries(u8 nr_entries,
+			       struct zswap_entry *entries[],
+			       int node_id)
+{
+	u8 i;
+
+	for (i = 0; i < nr_entries; ++i) {
+		entries[i] = zswap_entry_cache_alloc(GFP_KERNEL, node_id);
+		if (!entries[i]) {
+			u8 j;
+
+			zswap_reject_kmemcache_fail++;
+			for (j = 0; j < i; ++j)
+				zswap_entry_cache_free(entries[j]);
+			return -EINVAL;
+		}
+
+		entries[i]->handle = (unsigned long)ERR_PTR(-EINVAL);
+	}
+
+	return 0;
+}
+
+/*
+ * If the zswap store fails or zswap is disabled, we must invalidate
+ * the possibly stale entries which were previously stored at the
+ * offsets corresponding to each page of the folio. Otherwise,
+ * writeback could overwrite the new data in the swapfile.
+ */
+static void zswap_delete_stored_entries(struct folio *folio)
+{
+	swp_entry_t swp = folio->swap;
+	unsigned type = swp_type(swp);
+	pgoff_t offset = swp_offset(swp);
+	struct zswap_entry *entry;
+	struct xarray *tree;
+	long index;
+
+	for (index = 0; index < folio_nr_pages(folio); ++index) {
+		tree = swap_zswap_tree(swp_entry(type, offset + index));
+		entry = xa_erase(tree, offset + index);
+		if (entry)
+			zswap_entry_free(entry);
+	}
+}
+
+static void zswap_store_process_folio_errors(
+	struct folio **folios,
+	int *errors,
+	unsigned int nr_folios)
+{
+	u8 batch_idx;
+
+	for (batch_idx = 0; batch_idx < nr_folios; ++batch_idx)
+		if (errors[batch_idx])
+			zswap_delete_stored_entries(folios[batch_idx]);
+}
+
+/*
+ * Store a (batch of) any-order large folio(s) in zswap. Each folio will be
+ * broken into sub-batches of SWAP_CRYPTO_SUB_BATCH_SIZE pages, the
+ * sub-batch will be compressed by IAA in parallel, and stored in zpool/xarray.
+ *
+ * This is the main procedure for batching of folios, and batching within
+ * large folios.
+ *
+ * This procedure should only be called if zswap supports batching of stores.
+ * Otherwise, the sequential implementation for storing folios as in the
+ * current zswap_store() should be used.
+ *
+ * The signature of this procedure is meant to allow the calling function
+ * (for instance, swap_writepage()) to pass an array @folios
+ * (the "reclaim batch") of @nr_folios folios to be stored in zswap.
+ * All folios in the batch must have the same swap type and folio_nid
+ * @node_id (simplifying assumptions only to manage code complexity).
+ *
+ * @errors and @folios have @nr_folios number of entries, with one-to-one
+ * correspondence (@errors[i] represents the error status of @folios[i],
+ * for i in @nr_folios).
+ * The calling function (for instance, swap_writepage()) should initialize
+ * @errors[i] to a non-0 value.
+ * If zswap successfully stores @folios[i], it will set @errors[i] to 0.
+ * If there is an error in zswap, it will set @errors[i] to -EINVAL.
+ */
+static void __zswap_store_batch_core(
+	int node_id,
+	struct folio **folios,
+	int *errors,
+	unsigned int nr_folios)
+{
+	struct zswap_store_sub_batch_page sub_batch[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct page *comp_pages[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	u8 *comp_dsts[SWAP_CRYPTO_SUB_BATCH_SIZE] = { NULL };
+	unsigned int comp_dlens[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	int comp_errors[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct crypto_acomp_ctx *acomp_ctx;
+	struct zswap_pool *pool;
+	/*
+	 * For now, let's say a max of 256 large folios can be reclaimed
+	 * at a time, as a batch. If this exceeds 256, change this to u16.
+	 */
+	u8 batch_idx;
+
+	/* Initialize the compress batching pipeline state.
 */
+	struct zswap_store_pipeline_state zst = {
+		.errors = errors,
+		.pool = NULL,
+		.acomp_ctx = NULL,
+		.sub_batch = sub_batch,
+		.comp_pages = comp_pages,
+		.comp_dsts = comp_dsts,
+		.comp_dlens = comp_dlens,
+		.comp_errors = comp_errors,
+		.nr_comp_pages = 0,
+	};
+
+	pool = zswap_pool_current_get();
+	if (!pool) {
+		if (zswap_check_limits())
+			queue_work(shrink_wq, &zswap_shrink_work);
+		goto check_old;
+	}
+
+	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
+	mutex_lock(&acomp_ctx->mutex);
+	zst.pool = pool;
+	zst.acomp_ctx = acomp_ctx;
+
+	/*
+	 * Iterate over the folios passed in. Construct sub-batches of up to
+	 * SWAP_CRYPTO_SUB_BATCH_SIZE pages, if necessary, by iterating through
+	 * multiple folios from the input "folios". Process each sub-batch
+	 * with IAA batch compression. Detect errors from batch compression
+	 * and set the impacted folio's error status (this happens in
+	 * zswap_store_propagate_errors()).
+	 */
+	for (batch_idx = 0; batch_idx < nr_folios; ++batch_idx) {
+		struct folio *folio = folios[batch_idx];
+		struct zswap_entry *entries[SWAP_CRYPTO_SUB_BATCH_SIZE];
+		struct obj_cgroup *objcg = NULL;
+		struct mem_cgroup *memcg = NULL;
+		long folio_start_idx, nr_pages;
+
+		BUG_ON(!folio);
+		nr_pages = folio_nr_pages(folio);
+
+		VM_WARN_ON_ONCE(!folio_test_locked(folio));
+		VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
+
+		/*
+		 * If zswap is disabled, we must invalidate the possibly stale
+		 * entry which was previously stored at this offset. Otherwise,
+		 * writeback could overwrite the new data in the swapfile.
+		 */
+		if (!zswap_enabled)
+			continue;
+
+		/* Check cgroup limits */
+		objcg = get_obj_cgroup_from_folio(folio);
+		if (objcg && !obj_cgroup_may_zswap(objcg)) {
+			memcg = get_mem_cgroup_from_objcg(objcg);
+			if (shrink_memcg(memcg)) {
+				mem_cgroup_put(memcg);
+				goto put_objcg;
+			}
+			mem_cgroup_put(memcg);
+		}
+
+		if (zswap_check_limits())
+			goto put_objcg;
+
+		if (objcg) {
+			memcg = get_mem_cgroup_from_objcg(objcg);
+			if (memcg_list_lru_alloc(memcg, &zswap_list_lru, GFP_KERNEL)) {
+				mem_cgroup_put(memcg);
+				goto put_objcg;
+			}
+			mem_cgroup_put(memcg);
+		}
+
+		/*
+		 * By default, set zswap status to "success" for use in
+		 * swap_writepage() when this returns. In case of errors,
+		 * a negative error number will overwrite this when
+		 * zswap_store_propagate_errors() is called.
+		 */
+		errors[batch_idx] = 0;
+
+		folio_start_idx = 0;
+
+		while (nr_pages > 0) {
+			u8 add_nr_pages;
+
+			/*
+			 * If we have accumulated SWAP_CRYPTO_SUB_BATCH_SIZE
+			 * pages, process the sub-batch: it could contain pages
+			 * from multiple folios.
+			 */
+			if (zst.nr_comp_pages == SWAP_CRYPTO_SUB_BATCH_SIZE) {
+				zswap_store_sub_batch(&zst);
+				zswap_store_reset_sub_batch(&zst);
+				/*
+				 * Stop processing this folio if it had
+				 * compress errors.
+				 */
+				if (errors[batch_idx])
+					goto put_objcg;
+			}
+
+			add_nr_pages = min3((
+				(long)SWAP_CRYPTO_SUB_BATCH_SIZE -
+				(long)zst.nr_comp_pages),
+				nr_pages,
+				(long)SWAP_CRYPTO_SUB_BATCH_SIZE);
+
+			/*
+			 * Allocate zswap_entries for this sub-batch. If we
+			 * get errors while doing so, we can flag an error
+			 * for the folio, call the shrinker and move on.
+			 */
+			if (zswap_alloc_entries(add_nr_pages,
+						entries, node_id)) {
+				zswap_store_reset_sub_batch(&zst);
+				errors[batch_idx] = -EINVAL;
+				goto put_objcg;
+			}
+
+			zswap_add_folio_pages_to_sb(
+				&zst,
+				folio,
+				batch_idx,
+				objcg,
+				entries,
+				folio_start_idx,
+				add_nr_pages);
+
+			nr_pages -= add_nr_pages;
+			folio_start_idx += add_nr_pages;
+		} /* this folio has pages to be compressed.
*/ + + obj_cgroup_put(objcg); + continue; + +put_objcg: + obj_cgroup_put(objcg); + if (zswap_pool_reached_full) + queue_work(shrink_wq, &zswap_shrink_work); + } /* for batch folios */ + + if (!zswap_enabled) + goto check_old; + + /* + * Process last sub-batch: it could contain pages from + * multiple folios. + */ + if (zst.nr_comp_pages) + zswap_store_sub_batch(&zst); + + mutex_unlock(&acomp_ctx->mutex); + zswap_pool_put(pool); +check_old: + zswap_store_process_folio_errors(folios, errors, nr_folios); +} + bool zswap_load(struct folio *folio) { swp_entry_t swp = folio->swap; From patchwork Fri Oct 18 06:41:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Sridhar, Kanchana P" X-Patchwork-Id: 13841240 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BFEED3C550 for ; Fri, 18 Oct 2024 06:41:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AA4686B009E; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9D64A6B00A3; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5C5536B00A0; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 2A3426B009E for ; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id BCBB114059D for ; Fri, 18 Oct 2024 06:41:03 +0000 (UTC) X-FDA: 82685775816.24.042F940 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by imf24.hostedemail.com (Postfix) with ESMTP id 6F867180015 for ; 
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
    yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
    usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
    21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
    herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
    ardb@kernel.org, ebiggers@google.com, surenb@google.com,
    kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
    brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
    joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
    linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com,
    kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 13/13] mm: vmscan, swap, zswap: Compress batching of folios in shrink_folio_list().
Date: Thu, 17 Oct 2024 23:41:01 -0700
Message-Id: <20241018064101.336232-14-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
MIME-Version: 1.0

This patch enables the use of Intel IAA hardware compression
acceleration to reclaim a batch of folios in
shrink_folio_list(). This results in reclaim throughput and
workload/sys performance improvements.

The earlier patches on compress batching deployed multiple IAA
compress engines to compress up to SWAP_CRYPTO_SUB_BATCH_SIZE pages
within a large folio being stored in zswap_store(). This patch further
propagates the efficiency improvements demonstrated with IAA "batching
within folios" to vmscan "batching of folios", which also uses batching
within folios through the extensible architecture of the
__zswap_store_batch_core() procedure added earlier, which accepts an
array of folios.

A plug mechanism is introduced in swap_writepage() to aggregate a
batch of up to vm.compress-batchsize ([1, 32]) folios before
processing the plug. The plug is processed when any of the following
is true:

1) The plug has vm.compress-batchsize folios. If the system has Intel
   IAA, "sysctl vm.compress-batchsize" can be configured to a value in
   [1, 32]. On systems without IAA, or if
   CONFIG_ZSWAP_STORE_BATCHING_ENABLED is not set,
   "sysctl vm.compress-batchsize" can only be 1.

2) A folio with a swap type or folio_nid different from the folios
   currently in the plug needs to be added to the plug.

3) A pmd-mappable folio needs to be swapped out. In this case, the
   existing folios in the plug are processed first. The pmd-mappable
   folio is then swapped out in a batch of its own (zswap_store() will
   batch-compress SWAP_CRYPTO_SUB_BATCH_SIZE pages of the
   pmd-mappable folio if the system has IAA).

From zswap's perspective, it now receives a hybrid batch of any-order
(non-pmd-mappable) folios when the plug is processed via
zswap_store_batch(), which calls __zswap_store_batch_core(). This
ensures that the zswap compress batching pipeline occupancy and
reclaim throughput are maximized.

The shrink_folio_list() interface with swap_writepage() is modified
to work with the plug mechanism.
When shrink_folio_list() calls pageout(), it needs to handle two new
return codes from pageout(), namely PAGE_BATCHED and
PAGE_BATCH_SUCCESS:

PAGE_BATCHED: The page is not yet swapped out; we need to wait for the
"imc_plug" batch to be processed before running the post-pageout
computes in shrink_folio_list().

PAGE_BATCH_SUCCESS: When the "imc_plug" is processed in
swap_writepage(), a newly added status "AOP_PAGE_BATCH_SUCCESS" is
returned to pageout(), which in turn returns PAGE_BATCH_SUCCESS to
shrink_folio_list(). Upon receiving PAGE_BATCH_SUCCESS from pageout(),
shrink_folio_list() must serialize and run the post-pageout computes
for all the folios in "imc_plug".

To summarize this approach: this patch introduces a plug in reclaim
that aggregates a batch of folios, parallelizes the zswap store of the
folios using IAA hardware acceleration, then returns to run the
serialized flow after the "batch pageout". The patch attempts to do
this with a minimal/necessary amount of code duplication, by adding an
iteration through the "imc_plug" folios in shrink_folio_list(). I have
validated this extensively and not seen any issues. I would appreciate
suggestions to improve upon this approach.

This functionality is submitted as a single distinct patch in the RFC
patch-series because all the changes in this specific patch are for
shrink_folio_list() batching; they wouldn't make sense without the
functionality in this patch. Besides the functionality itself, I would
also appreciate comments on whether the patch needs to be organized
differently.

Thanks Ying Huang for suggesting ideas on simplifying the vmscan
interface to the swap_writepage() plug mechanism.
Suggested-by: Ying Huang
Signed-off-by: Kanchana P Sridhar
---
 include/linux/fs.h        |   2 +
 include/linux/mm.h        |   8 ++
 include/linux/writeback.h |   5 ++
 include/linux/zswap.h     |  16 ++++
 kernel/sysctl.c           |   9 +++
 mm/page_io.c              | 152 ++++++++++++++++++++++++++++++++++++-
 mm/swap.c                 |  15 ++++
 mm/swap.h                 |  40 ++++++++++
 mm/vmscan.c               | 154 +++++++++++++++++++++++++++++++-------
 mm/zswap.c                |  20 +++++
 10 files changed, 394 insertions(+), 27 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3559446279c1..2868925568a5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -303,6 +303,8 @@ struct iattr {
 enum positive_aop_returns {
 	AOP_WRITEPAGE_ACTIVATE	= 0x80000,
 	AOP_TRUNCATED_PAGE	= 0x80001,
+	AOP_PAGE_BATCHED	= 0x80002,
+	AOP_PAGE_BATCH_SUCCESS	= 0x80003,
 };
 
 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4c32003c8404..a8035e163793 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -80,6 +80,14 @@ extern void * high_memory;
 extern int page_cluster;
 extern const int page_cluster_max;
 
+/*
+ * Compress batching of any-order folios in the reclaim path with IAA.
+ * The number of folios to batch reclaim can be set through
+ * "sysctl vm.compress-batchsize" which can be a value in [1, 32].
+ */
+extern int compress_batchsize;
+extern const int compress_batchsize_max;
+
 #ifdef CONFIG_SYSCTL
 extern int sysctl_legacy_va_layout;
 #else
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d6db822e4bb3..41629ea5699d 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -82,6 +82,11 @@ struct writeback_control {
 	/* Target list for splitting a large folio */
 	struct list_head *list;
 
+	/*
+	 * Plug for storing reclaim folios for compress batching.
+	 */
+	struct swap_in_memory_cache_cb *swap_in_memory_cache_plug;
+
 	/* internal fields used by the ->writepages implementation: */
 	struct folio_batch fbatch;
 	pgoff_t index;
diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 9bbe330686f6..328a1e09d502 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -11,6 +11,8 @@ extern atomic_long_t zswap_stored_pages;
 
 #ifdef CONFIG_ZSWAP
 
+struct swap_in_memory_cache_cb;
+
 struct zswap_lruvec_state {
 	/*
 	 * Number of swapped in pages from disk, i.e not found in the zswap pool.
@@ -107,6 +109,15 @@ struct zswap_store_pipeline_state {
 };
 
 bool zswap_store_batching_enabled(void);
+void __zswap_store_batch(struct swap_in_memory_cache_cb *simc);
+void __zswap_store_batch_single(struct swap_in_memory_cache_cb *simc);
+static inline void zswap_store_batch(struct swap_in_memory_cache_cb *simc)
+{
+	if (zswap_store_batching_enabled())
+		__zswap_store_batch(simc);
+	else
+		__zswap_store_batch_single(simc);
+}
 unsigned long zswap_total_pages(void);
 bool zswap_store(struct folio *folio);
 bool zswap_load(struct folio *folio);
@@ -123,12 +134,17 @@ bool zswap_never_enabled(void);
 struct zswap_lruvec_state {};
 struct zswap_store_sub_batch_page {};
 struct zswap_store_pipeline_state {};
+struct swap_in_memory_cache_cb;
 
 static inline bool zswap_store_batching_enabled(void)
 {
 	return false;
 }
 
+static inline void zswap_store_batch(struct swap_in_memory_cache_cb *simc)
+{
+}
+
 static inline bool zswap_store(struct folio *folio)
 {
 	return false;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 79e6cb1d5c48..b8d6b599e9ae 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2064,6 +2064,15 @@ static struct ctl_table vm_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= (void *)&page_cluster_max,
 	},
+	{
+		.procname	= "compress-batchsize",
+		.data		= &compress_batchsize,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ONE,
+		.extra2		= (void *)&compress_batchsize_max,
+	},
 	{
 		.procname	= "dirtytime_expire_seconds",
 		.data		= &dirtytime_expire_interval,
diff --git a/mm/page_io.c b/mm/page_io.c
index a28d28b6b3ce..065db25309b8 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -226,6 +226,131 @@ static void swap_zeromap_folio_clear(struct folio *folio)
 	}
 }
 
+/*
+ * For batching of folios in reclaim path for zswap batch compressions
+ * with Intel IAA.
+ */
+static void simc_write_in_memory_cache_complete(
+	struct swap_in_memory_cache_cb *simc,
+	struct writeback_control *wbc)
+{
+	int i;
+
+	/* All elements of a plug write batch have the same swap type. */
+	struct swap_info_struct *sis = swp_swap_info(simc->folios[0]->swap);
+
+	VM_BUG_ON(!sis);
+
+	for (i = 0; i < simc->nr_folios; ++i) {
+		struct folio *folio = simc->folios[i];
+
+		if (!simc->errors[i]) {
+			count_mthp_stat(folio_order(folio), MTHP_STAT_ZSWPOUT);
+			folio_unlock(folio);
+		} else {
+			__swap_writepage(simc->folios[i], wbc);
+		}
+	}
+}
+
+void swap_write_in_memory_cache_unplug(struct swap_in_memory_cache_cb *simc,
+				       struct writeback_control *wbc)
+{
+	unsigned long pflags;
+
+	psi_memstall_enter(&pflags);
+
+	zswap_store_batch(simc);
+
+	simc_write_in_memory_cache_complete(simc, wbc);
+
+	psi_memstall_leave(&pflags);
+
+	simc->processed = true;
+}
+
+/*
+ * Only called by swap_writepage() if (wbc && wbc->swap_in_memory_cache_plug)
+ * is true i.e., from shrink_folio_list()->pageout() path.
+ */
+static bool swap_writepage_in_memory_cache(struct folio *folio,
+					   struct writeback_control *wbc)
+{
+	struct swap_in_memory_cache_cb *simc;
+	unsigned type = swp_type(folio->swap);
+	int node_id = folio_nid(folio);
+	int comp_batch_size = READ_ONCE(compress_batchsize);
+	bool ret = false;
+
+	simc = wbc->swap_in_memory_cache_plug;
+
+	if ((simc->nr_folios > 0) &&
+	    ((simc->type != type) || (simc->node_id != node_id) ||
+	     folio_test_pmd_mappable(folio) ||
+	     (simc->nr_folios == comp_batch_size))) {
+		swap_write_in_memory_cache_unplug(simc, wbc);
+		ret = true;
+		simc->next_batch_folio = folio;
+	} else {
+		simc->type = type;
+		simc->node_id = node_id;
+		simc->folios[simc->nr_folios] = folio;
+
+		/*
+		 * If zswap successfully stores a page, it should set
+		 * simc->errors[] to 0.
+		 */
+		simc->errors[simc->nr_folios] = -1;
+		simc->nr_folios++;
+	}
+
+	return ret;
+}
+
+void swap_writepage_in_memory_cache_transition(void *arg)
+{
+	struct swap_in_memory_cache_cb *simc =
+		(struct swap_in_memory_cache_cb *) arg;
+	simc->nr_folios = 0;
+	simc->processed = false;
+
+	if (simc->next_batch_folio) {
+		struct folio *folio = simc->next_batch_folio;
+		simc->folios[simc->nr_folios] = folio;
+		simc->type = swp_type(folio->swap);
+		simc->node_id = folio_nid(folio);
+		simc->next_batch_folio = NULL;
+
+		/*
+		 * If zswap successfully stores a page, it should set
+		 * simc->errors[] to 0.
+		 */
+		simc->errors[simc->nr_folios] = -1;
+		simc->nr_folios++;
+	}
+}
+
+void swap_writepage_in_memory_cache_init(void *arg)
+{
+	struct swap_in_memory_cache_cb *simc =
+		(struct swap_in_memory_cache_cb *) arg;
+
+	simc->nr_folios = 0;
+	simc->processed = false;
+	simc->next_batch_folio = NULL;
+	simc->transition = &swap_writepage_in_memory_cache_transition;
+}
+
+/*
+ * zswap batching of folios with IAA:
+ *
+ * Reclaim batching note for pmd-mappable folios:
+ * Any pmd-mappable folio in the reclaim path will be processed in a batch
+ * comprising only that folio. There will be no mixed batches containing
+ * pmd-mappable folios for batch compression with IAA.
+ * There are no restrictions with other large folios: a reclaim batch
+ * can comprise of any-order mix of non-pmd-mappable folios.
+ */
 /*
  * We may have stale swap cache pages in memory: notice
  * them here and get rid of the unnecessary final write.
@@ -268,7 +393,32 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 		 */
 		swap_zeromap_folio_clear(folio);
 	}
-	if (zswap_store(folio)) {
+
+	/*
+	 * Batching of compressions with IAA: If reclaim path pageout has
+	 * invoked swap_writepage with a wbc->swap_in_memory_cache_plug,
+	 * add the page to the plug, or invoke zswap_store_batch() if
+	 * "vm.compress-batchsize" elements have been stored in
+	 * the plug.
+	 *
+	 * If swap_writepage has been called from other kernel code without
+	 * a wbc->swap_in_memory_cache_plug, call zswap_store() with the folio
+	 * (i.e. without adding the folio to a plug for batch processing).
+	 */
+	if (wbc && wbc->swap_in_memory_cache_plug) {
+		if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio)) &&
+		    !zswap_is_enabled() &&
+		    folio_memcg(folio) &&
+		    !READ_ONCE(folio_memcg(folio)->zswap_writeback)) {
+			folio_mark_dirty(folio);
+			return AOP_WRITEPAGE_ACTIVATE;
+		}
+
+		if (swap_writepage_in_memory_cache(folio, wbc))
+			return AOP_PAGE_BATCH_SUCCESS;
+		else
+			return AOP_PAGE_BATCHED;
+	} else if (zswap_store(folio)) {
 		count_mthp_stat(folio_order(folio), MTHP_STAT_ZSWPOUT);
 		folio_unlock(folio);
 		return 0;
diff --git a/mm/swap.c b/mm/swap.c
index 835bdf324b76..095630d6c35e 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -38,6 +38,7 @@
 #include
 #include
 
+#include "swap.h"
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -47,6 +48,14 @@
 int page_cluster;
 const int page_cluster_max = 31;
 
+/*
+ * Number of pages in a reclaim batch for pageout.
+ * If zswap is enabled, this is the batch-size for zswap
+ * compress batching of multiple any-order folios.
+ */
+int compress_batchsize;
+const int compress_batchsize_max = SWAP_CRYPTO_MAX_COMP_BATCH_SIZE;
+
 struct cpu_fbatches {
 	/*
 	 * The following folio batches are grouped together because they are protected
@@ -1105,4 +1114,10 @@ void __init swap_setup(void)
 	 * Right now other parts of the system means that we
 	 * _really_ don't want to cluster much more
 	 */
+
+	/*
+	 * Initialize the number of pages in a reclaim batch
+	 * for pageout.
+	 */
+	compress_batchsize = 1;
 }
diff --git a/mm/swap.h b/mm/swap.h
index 4dcb67e2cc33..08c04954304f 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -20,6 +20,13 @@ struct mempolicy;
 #define SWAP_CRYPTO_SUB_BATCH_SIZE 1UL
 #endif
 
+/* Set the vm.compress-batchsize limits. */
+#if defined(CONFIG_ZSWAP_STORE_BATCHING_ENABLED)
+#define SWAP_CRYPTO_MAX_COMP_BATCH_SIZE SWAP_CLUSTER_MAX
+#else
+#define SWAP_CRYPTO_MAX_COMP_BATCH_SIZE 1UL
+#endif
+
 /* linux/mm/swap_state.c, zswap.c */
 struct crypto_acomp_ctx {
 	struct crypto_acomp *acomp;
@@ -53,6 +60,20 @@ void swap_crypto_acomp_compress_batch(
 	int nr_pages,
 	struct crypto_acomp_ctx *acomp_ctx);
 
+/* linux/mm/vmscan.c, linux/mm/page_io.c, linux/mm/zswap.c */
+/* For batching of compressions in reclaim path. */
+struct swap_in_memory_cache_cb {
+	unsigned int type;
+	int node_id;
+	struct folio *folios[SWAP_CLUSTER_MAX];
+	int errors[SWAP_CLUSTER_MAX];
+	unsigned int nr_folios;
+	bool processed;
+	struct folio *next_batch_folio;
+	void (*transition)(void *);
+	void (*init)(void *);
+};
+
 /* linux/mm/page_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
@@ -63,6 +84,10 @@ static inline void swap_read_unplug(struct swap_iocb *plug)
 	if (unlikely(plug))
 		__swap_read_unplug(plug);
 }
+void swap_writepage_in_memory_cache_init(void *arg);
+void swap_writepage_in_memory_cache_transition(void *arg);
+void swap_write_in_memory_cache_unplug(struct swap_in_memory_cache_cb *simc,
+				       struct writeback_control *wbc);
 void swap_write_unplug(struct swap_iocb *sio);
 int swap_writepage(struct page *page, struct writeback_control *wbc);
 void __swap_writepage(struct folio *folio, struct writeback_control *wbc);
@@ -164,6 +189,21 @@ static inline void swap_crypto_acomp_compress_batch(
 {
 }
 
+struct swap_in_memory_cache_cb {};
+static void swap_writepage_in_memory_cache_init(void *arg)
+{
+}
+
+static void swap_writepage_in_memory_cache_transition(void *arg)
+{
+}
+
+static inline void swap_write_in_memory_cache_unplug(
+	struct swap_in_memory_cache_cb *simc,
+	struct writeback_control *wbc)
+{
+}
+
 static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 {
 }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fd3908d43b07..145e6cde78cd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -619,6 +619,13 @@ typedef enum {
 	PAGE_ACTIVATE,
 	/* folio has been sent to the disk successfully, folio is unlocked */
 	PAGE_SUCCESS,
+	/*
+	 * reclaim folio batch has been sent to swap successfully,
+	 * folios are unlocked
+	 */
+	PAGE_BATCH_SUCCESS,
+	/* folio has been added to the reclaim batch. */
+	PAGE_BATCHED,
 	/* folio is clean and locked */
 	PAGE_CLEAN,
 } pageout_t;
@@ -628,7 +635,8 @@ typedef enum {
  * Calls ->writepage().
 */
 static pageout_t pageout(struct folio *folio, struct address_space *mapping,
-			 struct swap_iocb **plug, struct list_head *folio_list)
+			 struct swap_iocb **plug, struct list_head *folio_list,
+			 struct swap_in_memory_cache_cb *imc_plug)
 {
 	/*
 	 * If the folio is dirty, only perform writeback if that write
@@ -674,6 +682,7 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
 			.range_end = LLONG_MAX,
 			.for_reclaim = 1,
 			.swap_plug = plug,
+			.swap_in_memory_cache_plug = imc_plug,
 		};
 
 		/*
@@ -693,6 +702,23 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
 			return PAGE_ACTIVATE;
 		}
 
+		if (res == AOP_PAGE_BATCHED)
+			return PAGE_BATCHED;
+
+		if (res == AOP_PAGE_BATCH_SUCCESS) {
+			int r;
+			for (r = 0; r < imc_plug->nr_folios; ++r) {
+				struct folio *rfolio = imc_plug->folios[r];
+				if (!folio_test_writeback(rfolio)) {
+					/* synchronous write or broken a_ops? */
+					folio_clear_reclaim(rfolio);
+				}
+				trace_mm_vmscan_write_folio(rfolio);
+				node_stat_add_folio(rfolio, NR_VMSCAN_WRITE);
+			}
+			return PAGE_BATCH_SUCCESS;
+		}
+
 		if (!folio_test_writeback(folio)) {
 			/* synchronous write or broken a_ops? */
 			folio_clear_reclaim(folio);
@@ -1035,6 +1061,12 @@ static bool may_enter_fs(struct folio *folio, gfp_t gfp_mask)
 	return !data_race(folio_swap_flags(folio) & SWP_FS_OPS);
 }
 
+static __always_inline bool reclaim_batch_being_processed(
+	struct swap_in_memory_cache_cb *imc_plug)
+{
+	return imc_plug->nr_folios && imc_plug->processed;
+}
+
 /*
  * shrink_folio_list() returns the number of reclaimed pages
  */
@@ -1049,22 +1081,54 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 	unsigned int pgactivate = 0;
 	bool do_demote_pass;
 	struct swap_iocb *plug = NULL;
+	struct swap_in_memory_cache_cb imc_plug;
+	bool imc_plug_path = false;
+	struct folio *folio;
+	int r;
 
+	imc_plug.init = &swap_writepage_in_memory_cache_init;
+	imc_plug.init(&imc_plug);
 	folio_batch_init(&free_folios);
 	memset(stat, 0, sizeof(*stat));
 	cond_resched();
 	do_demote_pass = can_demote(pgdat->node_id, sc);
 
retry:
-	while (!list_empty(folio_list)) {
+	while (!list_empty(folio_list) || (imc_plug.nr_folios && !imc_plug.processed)) {
 		struct address_space *mapping;
-		struct folio *folio;
 		enum folio_references references = FOLIOREF_RECLAIM;
 		bool dirty, writeback;
 		unsigned int nr_pages;
 
+		imc_plug_path = false;
 		cond_resched();
 
+		/* Reclaim path zswap/zram batching using IAA. */
+		if (list_empty(folio_list)) {
+			struct writeback_control wbc = {
+				.sync_mode = WB_SYNC_NONE,
+				.nr_to_write = SWAP_CLUSTER_MAX,
+				.range_start = 0,
+				.range_end = LLONG_MAX,
+				.for_reclaim = 1,
+				.swap_plug = &plug,
+				.swap_in_memory_cache_plug = &imc_plug,
+			};
+
+			swap_write_in_memory_cache_unplug(&imc_plug, &wbc);
+
+			for (r = 0; r < imc_plug.nr_folios; ++r) {
+				struct folio *rfolio = imc_plug.folios[r];
+				if (!folio_test_writeback(rfolio)) {
+					/* synchronous write or broken a_ops? */
+					folio_clear_reclaim(rfolio);
+				}
+				trace_mm_vmscan_write_folio(rfolio);
+				node_stat_add_folio(rfolio, NR_VMSCAN_WRITE);
+			}
+			goto serialize_post_batch_pageout;
+		}
+
 		folio = lru_to_folio(folio_list);
 		list_del(&folio->lru);
 
@@ -1363,7 +1427,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			 * starts and then write it out here.
 			 */
 			try_to_unmap_flush_dirty();
-			switch (pageout(folio, mapping, &plug, folio_list)) {
+			switch (pageout(folio, mapping, &plug, folio_list, &imc_plug)) {
 			case PAGE_KEEP:
 				goto keep_locked;
 			case PAGE_ACTIVATE:
@@ -1377,34 +1441,66 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					nr_pages = 1;
 				}
 				goto activate_locked;
+			case PAGE_BATCHED:
+				continue;
 			case PAGE_SUCCESS:
-				if (nr_pages > 1 && !folio_test_large(folio)) {
-					sc->nr_scanned -= (nr_pages - 1);
-					nr_pages = 1;
-				}
-				stat->nr_pageout += nr_pages;
-
-				if (folio_test_writeback(folio))
-					goto keep;
-				if (folio_test_dirty(folio))
-					goto keep;
-
-				/*
-				 * A synchronous write - probably a ramdisk.  Go
-				 * ahead and try to reclaim the folio.
-				 */
-				if (!folio_trylock(folio))
-					goto keep;
-				if (folio_test_dirty(folio) ||
-				    folio_test_writeback(folio))
-					goto keep_locked;
-				mapping = folio_mapping(folio);
-				fallthrough;
+				goto post_single_pageout;
+			case PAGE_BATCH_SUCCESS:
+				goto serialize_post_batch_pageout;
 			case PAGE_CLEAN:
+				goto folio_is_clean;
 				; /* try to free the folio below */
 			}
+		} else {
+			goto folio_is_clean;
+		}
+
+serialize_post_batch_pageout:
+		imc_plug_path = reclaim_batch_being_processed(&imc_plug);
+		if (!imc_plug_path) {
+			pr_err("imc_plug: type %u node_id %d \
+				nr_folios %u processed %d next_batch_folio %px",
+				imc_plug.type, imc_plug.node_id,
+				imc_plug.nr_folios, imc_plug.processed,
+				imc_plug.next_batch_folio);
+		}
+		BUG_ON(!imc_plug_path);
+		r = -1;
+
+next_folio_in_batch:
+		while (++r < imc_plug.nr_folios) {
+			folio = imc_plug.folios[r];
+			goto post_single_pageout;
+		} /* while imc_plug folios. */
+
+		imc_plug.transition(&imc_plug);
+		continue;
+
+post_single_pageout:
+		mapping = folio_mapping(folio);
+		nr_pages = folio_nr_pages(folio);
+		if (nr_pages > 1 && !folio_test_large(folio)) {
+			sc->nr_scanned -= (nr_pages - 1);
+			nr_pages = 1;
 		}
+		stat->nr_pageout += nr_pages;
+
+		if (folio_test_writeback(folio))
+			goto keep;
+		if (folio_test_dirty(folio))
+			goto keep;
+
+		/*
+		 * A synchronous write - probably a ramdisk.  Go
+		 * ahead and try to reclaim the folio.
+		 */
+		if (!folio_trylock(folio))
+			goto keep;
+		if (folio_test_dirty(folio) ||
+		    folio_test_writeback(folio))
+			goto keep_locked;
+folio_is_clean:
 		/*
 		 * If the folio has buffers, try to free the buffer
 		 * mappings associated with this folio. If we succeed
@@ -1444,6 +1540,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 				 * leave it off the LRU).
 				 */
 				nr_reclaimed += nr_pages;
+				if (imc_plug_path)
+					goto next_folio_in_batch;
 				continue;
 			}
 		}
@@ -1481,6 +1579,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			try_to_unmap_flush();
 			free_unref_folios(&free_folios);
 		}
+		if (imc_plug_path)
+			goto next_folio_in_batch;
 		continue;
 
activate_locked_split:
@@ -1510,6 +1610,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 		list_add(&folio->lru, &ret_folios);
 		VM_BUG_ON_FOLIO(folio_test_lru(folio) ||
 				folio_test_unevictable(folio), folio);
+		if (imc_plug_path)
+			goto next_folio_in_batch;
 	}
 
 	/* 'folio_list' is always empty here */
diff --git a/mm/zswap.c b/mm/zswap.c
index 1c12a7b9f4ff..68ce498ad000 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1666,6 +1666,26 @@ bool zswap_store(struct folio *folio)
 	return ret;
 }
 
+/*
+ * The batch contains <= vm.compress-batchsize nr of folios.
+ * All folios in the batch have the same swap type and folio_nid.
+ */
+void __zswap_store_batch(struct swap_in_memory_cache_cb *simc)
+{
+	__zswap_store_batch_core(simc->node_id, simc->folios,
+				 simc->errors, simc->nr_folios);
+}
+
+void __zswap_store_batch_single(struct swap_in_memory_cache_cb *simc)
+{
+	u8 i;
+
+	for (i = 0; i < simc->nr_folios; ++i) {
+		if (zswap_store(simc->folios[i]))
+			simc->errors[i] = 0;
+	}
+}
+
 /*
  * Note: If SWAP_CRYPTO_SUB_BATCH_SIZE exceeds 256, change the
  * u8 stack variables in the next several functions, to u16.