From patchwork Fri Oct 18 06:40:49 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P"
X-Patchwork-Id: 13841228
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
    yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
    usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
    21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
    herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
    ardb@kernel.org, ebiggers@google.com, surenb@google.com,
    kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
    brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
    joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
    linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 01/13] crypto: acomp - Add a poll() operation to acomp_alg and acomp_req
Date: Thu, 17 Oct 2024 23:40:49 -0700
Message-Id: <20241018064101.336232-2-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

For async compress/decompress, provide a way for the caller to poll for
completion rather than wait for an interrupt to signal it. Callers can
submit a request with crypto_acomp_compress() or crypto_acomp_decompress()
and then, instead of waiting on a completion, call crypto_acomp_poll() to
check whether the operation has finished. This is useful for hardware
accelerators where the overhead of interrupts and waiting for completions
is too expensive.
Typically the compress/decompress hardware operations complete very
quickly; in the vast majority of cases, the overhead of interrupt
handling and waiting for completions simply adds unnecessary delays and
cancels out the gains of using hardware acceleration.

Signed-off-by: Tom Zanussi
Signed-off-by: Kanchana P Sridhar
---
 crypto/acompress.c                  |  1 +
 include/crypto/acompress.h          | 18 ++++++++++++++++++
 include/crypto/internal/acompress.h |  1 +
 3 files changed, 20 insertions(+)

diff --git a/crypto/acompress.c b/crypto/acompress.c
index 6fdf0ff9f3c0..00ec7faa2714 100644
--- a/crypto/acompress.c
+++ b/crypto/acompress.c
@@ -71,6 +71,7 @@ static int crypto_acomp_init_tfm(struct crypto_tfm *tfm)

 	acomp->compress = alg->compress;
 	acomp->decompress = alg->decompress;
+	acomp->poll = alg->poll;
 	acomp->dst_free = alg->dst_free;
 	acomp->reqsize = alg->reqsize;

diff --git a/include/crypto/acompress.h b/include/crypto/acompress.h
index 54937b615239..65b5de30c8b1 100644
--- a/include/crypto/acompress.h
+++ b/include/crypto/acompress.h
@@ -51,6 +51,7 @@ struct acomp_req {
 struct crypto_acomp {
 	int (*compress)(struct acomp_req *req);
 	int (*decompress)(struct acomp_req *req);
+	int (*poll)(struct acomp_req *req);
 	void (*dst_free)(struct scatterlist *dst);
 	unsigned int reqsize;
 	struct crypto_tfm base;
@@ -265,4 +266,21 @@ static inline int crypto_acomp_decompress(struct acomp_req *req)
 	return crypto_acomp_reqtfm(req)->decompress(req);
 }

+/**
+ * crypto_acomp_poll() -- Invoke asynchronous poll operation
+ *
+ * Function invokes the asynchronous poll operation
+ *
+ * @req: asynchronous request
+ *
+ * Return: zero on poll completion, -EAGAIN if not complete, or
+ * error code in case of error
+ */
+static inline int crypto_acomp_poll(struct acomp_req *req)
+{
+	struct crypto_acomp *tfm = crypto_acomp_reqtfm(req);
+
+	return tfm->poll(req);
+}
+
 #endif

diff --git a/include/crypto/internal/acompress.h b/include/crypto/internal/acompress.h
index 8831edaafc05..fbf5f6c6eeb6 100644
--- a/include/crypto/internal/acompress.h
+++ b/include/crypto/internal/acompress.h
@@ -37,6 +37,7 @@ struct acomp_alg {

 	int (*compress)(struct acomp_req *req);
 	int (*decompress)(struct acomp_req *req);
+	int (*poll)(struct acomp_req *req);
 	void (*dst_free)(struct scatterlist *dst);
 	int (*init)(struct crypto_acomp *tfm);
 	void (*exit)(struct crypto_acomp *tfm);

From patchwork Fri Oct 18 06:40:50 2024
X-Patchwork-Id: 13841229
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 02/13] crypto: iaa - Add support for irq-less crypto async interface
Date: Thu, 17 Oct 2024 23:40:50 -0700
Message-Id: <20241018064101.336232-3-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
Add a crypto acomp poll() implementation so that callers can use true
async IAA compress/decompress without interrupts.

To use this mode with zswap, select the 'async' iaa_crypto sync_mode:

  echo async > /sys/bus/dsa/drivers/crypto/sync_mode

This causes the iaa_crypto driver to register its acomp_alg
implementation with a non-NULL poll() member, which callers such as
zswap can check for and, if present, use to drive true async mode.
Signed-off-by: Tom Zanussi
Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 74 ++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 237f87000070..6a8577ac1330 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -1788,6 +1788,74 @@ static void compression_ctx_init(struct iaa_compression_ctx *ctx)
 	ctx->use_irq = use_irq;
 }

+static int iaa_comp_poll(struct acomp_req *req)
+{
+	struct idxd_desc *idxd_desc;
+	struct idxd_device *idxd;
+	struct iaa_wq *iaa_wq;
+	struct pci_dev *pdev;
+	struct device *dev;
+	struct idxd_wq *wq;
+	bool compress_op;
+	int ret;
+
+	idxd_desc = req->base.data;
+	if (!idxd_desc)
+		return -EAGAIN;
+
+	compress_op = (idxd_desc->iax_hw->opcode == IAX_OPCODE_COMPRESS);
+	wq = idxd_desc->wq;
+	iaa_wq = idxd_wq_get_private(wq);
+	idxd = iaa_wq->iaa_device->idxd;
+	pdev = idxd->pdev;
+	dev = &pdev->dev;
+
+	ret = check_completion(dev, idxd_desc->iax_completion, true, true);
+	if (ret == -EAGAIN)
+		return ret;
+	if (ret)
+		goto out;
+
+	req->dlen = idxd_desc->iax_completion->output_size;
+
+	/* Update stats */
+	if (compress_op) {
+		update_total_comp_bytes_out(req->dlen);
+		update_wq_comp_bytes(wq, req->dlen);
+	} else {
+		update_total_decomp_bytes_in(req->slen);
+		update_wq_decomp_bytes(wq, req->slen);
+	}
+
+	if (iaa_verify_compress && (idxd_desc->iax_hw->opcode == IAX_OPCODE_COMPRESS)) {
+		struct crypto_tfm *tfm = req->base.tfm;
+		dma_addr_t src_addr, dst_addr;
+		u32 compression_crc;
+
+		compression_crc = idxd_desc->iax_completion->crc;
+
+		dma_sync_sg_for_device(dev, req->dst, 1, DMA_FROM_DEVICE);
+		dma_sync_sg_for_device(dev, req->src, 1, DMA_TO_DEVICE);
+
+		src_addr = sg_dma_address(req->src);
+		dst_addr = sg_dma_address(req->dst);
+
+		ret = iaa_compress_verify(tfm, req, wq, src_addr, req->slen,
+					  dst_addr, &req->dlen, compression_crc);
+	}
+out:
+	/* caller doesn't call crypto_wait_req, so no acomp_request_complete() */
+
+	dma_unmap_sg(dev, req->dst, sg_nents(req->dst), DMA_FROM_DEVICE);
+	dma_unmap_sg(dev, req->src, sg_nents(req->src), DMA_TO_DEVICE);
+
+	idxd_free_desc(idxd_desc->wq, idxd_desc);
+
+	dev_dbg(dev, "%s: returning ret=%d\n", __func__, ret);
+
+	return ret;
+}
+
 static int iaa_comp_init_fixed(struct crypto_acomp *acomp_tfm)
 {
 	struct crypto_tfm *tfm = crypto_acomp_tfm(acomp_tfm);
@@ -1813,6 +1881,7 @@ static struct acomp_alg iaa_acomp_fixed_deflate = {
 	.compress = iaa_comp_acompress,
 	.decompress = iaa_comp_adecompress,
 	.dst_free = dst_free,
+	.poll = iaa_comp_poll,
 	.base = {
 		.cra_name = "deflate",
 		.cra_driver_name = "deflate-iaa",
@@ -1827,6 +1896,11 @@ static int iaa_register_compression_device(void)
 {
 	int ret;

+	if (async_mode && !use_irq)
+		iaa_acomp_fixed_deflate.poll = iaa_comp_poll;
+	else
+		iaa_acomp_fixed_deflate.poll = NULL;
+
 	ret = crypto_register_acomp(&iaa_acomp_fixed_deflate);
 	if (ret) {
 		pr_err("deflate algorithm acomp fixed registration failed (%d)\n", ret);

From patchwork Fri Oct 18 06:40:51 2024
X-Patchwork-Id: 13841231
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 03/13] crypto: testmgr - Add crypto testmgr acomp poll support
Date: Thu, 17 Oct 2024 23:40:51 -0700
Message-Id: <20241018064101.336232-4-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
This patch enables the newly added acomp poll API to be exercised by the
crypto test_acomp() compress/decompress calls, if the acomp registers a
poll method.

Signed-off-by: Glover, Andre
Signed-off-by: Kanchana P Sridhar
---
 crypto/testmgr.c | 70 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 65 insertions(+), 5 deletions(-)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index ee8da628e9da..54f6f59ae501 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3482,7 +3482,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   crypto_req_done, &wait);

-	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_compress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: compression failed on test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3498,7 +3510,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	crypto_init_wait(&wait);
 	acomp_request_set_params(req, &src, &dst, ilen, dlen);

-	ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_decompress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: compression failed on test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3531,7 +3555,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	sg_init_one(&src, input_vec, ilen);
 	acomp_request_set_params(req, &src, NULL, ilen, 0);

-	ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_compress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_compress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: compression failed on NULL dst buffer test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3574,7 +3610,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   crypto_req_done, &wait);

-	ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_decompress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: decompression failed on test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);
@@ -3606,7 +3654,19 @@ static int test_acomp(struct crypto_acomp *tfm,
 	crypto_init_wait(&wait);
 	acomp_request_set_params(req, &src, NULL, ilen, 0);

-	ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	if (tfm->poll) {
+		ret = crypto_acomp_decompress(req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(req);
+				if (ret && ret != -EAGAIN)
+					break;
+			} while (ret);
+		}
+	} else {
+		ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	}
+
 	if (ret) {
 		pr_err("alg: acomp: decompression failed on NULL dst buffer test %d for %s: ret=%d\n",
 		       i + 1, algo, -ret);

From patchwork Fri Oct 18 06:40:52 2024
X-Patchwork-Id: 13841230
From patchwork Fri Oct 18 06:40:52 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841230
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 04/13] mm: zswap: zswap_compress()/decompress() can submit, then poll an acomp_req.
Date: Thu, 17 Oct 2024 23:40:52 -0700
Message-Id: <20241018064101.336232-5-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
If the crypto_acomp has a poll interface registered, zswap_compress() and
zswap_decompress() will submit the acomp_req and then poll() for a
completion/error status in a busy-wait loop. This provides an
interrupt-free, asynchronous way to manage (potentially multiple)
acomp_reqs, as supported by the iaa_crypto driver.

This enables batch submission of multiple compression/decompression jobs
to the Intel IAA hardware accelerator, which processes them in parallel,
followed by polling the batch's acomp_reqs for completion status.

Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 mm/zswap.c | 51 +++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 39 insertions(+), 12 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index f6316b66fb23..948c9745ee57 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -910,18 +910,34 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);

 	/*
-	 * it maybe looks a little bit silly that we send an asynchronous request,
-	 * then wait for its completion synchronously. This makes the process look
-	 * synchronous in fact.
-	 * Theoretically, acomp supports users send multiple acomp requests in one
-	 * acomp instance, then get those requests done simultaneously. but in this
-	 * case, zswap actually does store and load page by page, there is no
-	 * existing method to send the second page before the first page is done
-	 * in one thread doing zwap.
-	 * but in different threads running on different cpu, we have different
-	 * acomp instance, so multiple threads can do (de)compression in parallel.
+	 * If the crypto_acomp provides an asynchronous poll() interface,
+	 * submit the descriptor and poll for a completion status.
+	 *
+	 * It maybe looks a little bit silly that we send an asynchronous
+	 * request, then wait for its completion in a busy-wait poll loop, or,
+	 * synchronously. This makes the process look synchronous in fact.
+	 * Theoretically, acomp supports users send multiple acomp requests in
+	 * one acomp instance, then get those requests done simultaneously.
+	 * But in this case, zswap actually does store and load page by page,
+	 * there is no existing method to send the second page before the
+	 * first page is done in one thread doing zswap.
+	 * But in different threads running on different cpu, we have different
+	 * acomp instance, so multiple threads can do (de)compression in
+	 * parallel.
 	 */
-	comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+	if (acomp_ctx->acomp->poll) {
+		comp_ret = crypto_acomp_compress(acomp_ctx->req);
+		if (comp_ret == -EINPROGRESS) {
+			do {
+				comp_ret = crypto_acomp_poll(acomp_ctx->req);
+				if (comp_ret && comp_ret != -EAGAIN)
+					break;
+			} while (comp_ret);
+		}
+	} else {
+		comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+	}
+
 	dlen = acomp_ctx->req->dlen;
 	if (comp_ret)
 		goto unlock;
@@ -959,6 +975,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	struct scatterlist input, output;
 	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src;
+	int ret;

 	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
 	mutex_lock(&acomp_ctx->mutex);
@@ -984,7 +1001,17 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	sg_init_table(&output, 1);
 	sg_set_folio(&output, folio, PAGE_SIZE, 0);
 	acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
-	BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
+	if (acomp_ctx->acomp->poll) {
+		ret = crypto_acomp_decompress(acomp_ctx->req);
+		if (ret == -EINPROGRESS) {
+			do {
+				ret = crypto_acomp_poll(acomp_ctx->req);
+				BUG_ON(ret && ret != -EAGAIN);
+			} while (ret);
+		}
+	} else {
+		BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
+	}
 	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);

 	mutex_unlock(&acomp_ctx->mutex);
From patchwork Fri Oct 18 06:40:53 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841232
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 05/13] crypto: iaa - Make async mode the default.
Date: Thu, 17 Oct 2024 23:40:53 -0700
Message-Id: <20241018064101.336232-6-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
This patch makes the iaa_crypto driver load by default in the most
efficient/recommended "async" mode, namely, asynchronous submission of
descriptors, followed by polling for job completions. Previously, "sync"
mode was the default.

This way, anyone who wants to use IAA can do so after building the
kernel, and *without* having to go through these steps to use async
poll:

 1) disable all the IAA device/wq bindings that happen at boot time
 2) rmmod iaa_crypto
 3) modprobe iaa_crypto
 4) echo async > /sys/bus/dsa/drivers/crypto/sync_mode
 5) re-run initialization of the IAA devices and wqs

Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 6a8577ac1330..6c262b1eb09d 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -153,7 +153,7 @@ static DRIVER_ATTR_RW(verify_compress);
  */

 /* Use async mode */
-static bool async_mode;
+static bool async_mode = true;

 /* Use interrupts */
 static bool use_irq;
From patchwork Fri Oct 18 06:40:54 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841233
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 06/13] crypto: iaa - Disable iaa_verify_compress by default.
Date: Thu, 17 Oct 2024 23:40:54 -0700
Message-Id: <20241018064101.336232-7-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

This patch makes the iaa_crypto driver load by default with
"iaa_verify_compress" disabled, to facilitate performance comparisons
with software compressors (which also do not run compress verification
by default). Previously, iaa_crypto compress verification was enabled by
default.

With this patch, users who want to enable compress verification can do
so with these steps:

 1) disable all the IAA device/wq bindings that happen at boot time
 2) rmmod iaa_crypto
 3) modprobe iaa_crypto
 4) echo 1 > /sys/bus/dsa/drivers/crypto/verify_compress
 5) re-run initialization of the IAA devices and wqs

Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 6c262b1eb09d..8e6859c97970 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -94,7 +94,7 @@ static bool iaa_crypto_enabled;
 static bool iaa_crypto_registered;

 /* Verify results of IAA compress or not */
-static bool iaa_verify_compress = true;
+static bool iaa_verify_compress = false;

 static ssize_t verify_compress_show(struct device_driver *driver, char *buf)
 {
From patchwork Fri Oct 18 06:40:55 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P" <kanchana.p.sridhar@intel.com>
X-Patchwork-Id: 13841234
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Subject: [RFC PATCH v1 07/13] crypto: iaa - Change cpu-to-iaa mappings to evenly balance cores to IAAs.
Date: Thu, 17 Oct 2024 23:40:55 -0700
Message-Id: <20241018064101.336232-8-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
This change distributes the cpus more evenly among the IAAs in each
socket.

Old algorithm to assign cpus to IAA:
------------------------------------
If "nr_cpus" = nr_logical_cpus (includes hyper-threading), the current
algorithm determines "nr_cpus_per_node" = nr_cpus / nr_nodes. Hence, on
a 2-socket Sapphire Rapids server where each socket has 56 cores and 4
IAA devices, nr_cpus_per_node = 112.

Further, cpus_per_iaa = (nr_nodes * nr_cpus_per_node) / nr_iaa
Hence, cpus_per_iaa = 224/8 = 28.

The iaa_crypto driver then assigns 28 "logical" node cpus per IAA device
on that node, which results in this cpu-to-iaa mapping:

  lscpu|grep NUMA
  NUMA node(s):       2
  NUMA node0 CPU(s):  0-55,112-167
  NUMA node1 CPU(s):  56-111,168-223

  NUMA node 0:
  cpu   0-27     28-55    112-139   140-167
  iaa   iax1     iax3     iax5      iax7

  NUMA node 1:
  cpu   56-83    84-111   168-195   196-223
  iaa   iax9     iax11    iax13     iax15

This appears non-optimal for a few reasons:

1) The 2 logical threads on a core will get assigned to different IAA
   devices. E.g.:

   cpu 0:   iax1
   cpu 112: iax5

2) One of the logical threads on a core is assigned to an IAA that is
   not closest to that core, e.g. cpu 112.

3) If numactl is used to start processes sequentially on the logical
   cores, some of the IAA devices on the socket could be
   over-subscribed, while some could be under-utilized.

This patch introduces a scheme to more evenly balance the logical cores
to IAA devices on a socket.
If "nr_cpus" = nr_logical_cpus (includes hyper-threading), the new
algorithm determines "nr_cpus_per_node" = topology_num_cores_per_package().
Hence, on a 2-socket Sapphire Rapids server where each socket has 56
cores and 4 IAA devices, nr_cpus_per_node = 56.

Further, cpus_per_iaa = (nr_nodes * nr_cpus_per_node) / nr_iaa
Hence, cpus_per_iaa = 112/8 = 14.

The iaa_crypto driver then assigns 14 "logical" node cpus per IAA device
on that node, which results in this cpu-to-iaa mapping:

NUMA node 0:
cpu   0-13,112-125    14-27,126-139    28-41,140-153    42-55,154-167
iaa   iax1            iax3             iax5             iax7

NUMA node 1:
cpu   56-69,168-181   70-83,182-195    84-97,196-209    98-111,210-223
iaa   iax9            iax11            iax13            iax15

This resolves the 3 issues with non-optimality of cpu-to-iaa mappings
pointed out earlier with the existing approach.

Originally-by: Tom Zanussi
Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 84 ++++++++++++++--------
 1 file changed, 54 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index 8e6859c97970..c854a7a1aaa4 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -55,6 +55,46 @@ static struct idxd_wq *wq_table_next_wq(int cpu)
 	return entry->wqs[entry->cur_wq];
 }
 
+/*
+ * Given a cpu, find the closest IAA instance. The idea is to try to
+ * choose the most appropriate IAA instance for a caller and spread
+ * available workqueues around to clients.
+ */
+static inline int cpu_to_iaa(int cpu)
+{
+	int node, n_cpus = 0, test_cpu, iaa = 0;
+	int nr_iaa_per_node;
+	const struct cpumask *node_cpus;
+
+	if (!nr_nodes)
+		return 0;
+
+	nr_iaa_per_node = nr_iaa / nr_nodes;
+	if (!nr_iaa_per_node)
+		return 0;
+
+	for_each_online_node(node) {
+		node_cpus = cpumask_of_node(node);
+		if (!cpumask_test_cpu(cpu, node_cpus))
+			continue;
+
+		for_each_cpu(test_cpu, node_cpus) {
+			if ((n_cpus % nr_cpus_per_node) == 0)
+				iaa = node * nr_iaa_per_node;
+
+			if (test_cpu == cpu)
+				return iaa;
+
+			n_cpus++;
+
+			if ((n_cpus % cpus_per_iaa) == 0)
+				iaa++;
+		}
+	}
+
+	return -1;
+}
+
 static void wq_table_add(int cpu, struct idxd_wq *wq)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
@@ -895,8 +935,7 @@ static int wq_table_add_wqs(int iaa, int cpu)
  */
 static void rebalance_wq_table(void)
 {
-	const struct cpumask *node_cpus;
-	int node, cpu, iaa = -1;
+	int cpu, iaa;
 
 	if (nr_iaa == 0)
 		return;
@@ -906,37 +945,22 @@ static void rebalance_wq_table(void)
 
 	clear_wq_table();
 
-	if (nr_iaa == 1) {
-		for (cpu = 0; cpu < nr_cpus; cpu++) {
-			if (WARN_ON(wq_table_add_wqs(0, cpu))) {
-				pr_debug("could not add any wqs for iaa 0 to cpu %d!\n", cpu);
-				return;
-			}
-		}
-
-		return;
-	}
-
-	for_each_node_with_cpus(node) {
-		node_cpus = cpumask_of_node(node);
-
-		for (cpu = 0; cpu < cpumask_weight(node_cpus); cpu++) {
-			int node_cpu = cpumask_nth(cpu, node_cpus);
-
-			if (WARN_ON(node_cpu >= nr_cpu_ids)) {
-				pr_debug("node_cpu %d doesn't exist!\n", node_cpu);
-				return;
-			}
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+		iaa = cpu_to_iaa(cpu);
+		pr_debug("rebalance: cpu=%d iaa=%d\n", cpu, iaa);
 
-			if ((cpu % cpus_per_iaa) == 0)
-				iaa++;
+		if (WARN_ON(iaa == -1)) {
+			pr_debug("rebalance (cpu_to_iaa(%d)) failed!\n", cpu);
+			return;
+		}
 
-			if (WARN_ON(wq_table_add_wqs(iaa, node_cpu))) {
-				pr_debug("could not add any wqs for iaa %d to cpu %d!\n", iaa, cpu);
-				return;
-			}
+		if (WARN_ON(wq_table_add_wqs(iaa, cpu))) {
+			pr_debug("could not add any wqs for iaa %d to cpu %d!\n", iaa, cpu);
+			return;
 		}
 	}
+
+	pr_debug("Finished rebalance local wqs.");
 }
 
 static inline int check_completion(struct device *dev,
@@ -2084,7 +2108,7 @@ static int __init iaa_crypto_init_module(void)
 		pr_err("IAA couldn't find any nodes with cpus\n");
 		return -ENODEV;
 	}
-	nr_cpus_per_node = nr_cpus / nr_nodes;
+	nr_cpus_per_node = topology_num_cores_per_package();
 
 	if (crypto_has_comp("deflate-generic", 0, 0))
 		deflate_generic_tfm = crypto_alloc_comp("deflate-generic", 0, 0);

From patchwork Fri Oct 18 06:40:56 2024
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
 usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
 ardb@kernel.org, ebiggers@google.com, surenb@google.com,
 kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
 brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
 joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 08/13] crypto: iaa - Distribute compress jobs to all
 IAA devices on a NUMA node.
Date: Thu, 17 Oct 2024 23:40:56 -0700
Message-Id: <20241018064101.336232-9-kanchana.p.sridhar@intel.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
MIME-Version: 1.0

This change enables processes running on any logical core on a NUMA node
to use all the IAA devices enabled on that NUMA node for compress jobs.
In other words, compressions originating from any process in a node will
be distributed in round-robin manner to the available IAA devices on the
same socket. The main premise behind this change is to make sure that no
compress engines on any IAA device are left un-utilized/under-utilized.
In other words, the compress engines on all IAA devices are considered a
global resource for that socket. This allows the use of all IAA devices
present in a given NUMA node for (batched) compressions originating from
zswap/zram, from all cores on this node.

A new per-cpu "global_wq_table" implements this in the iaa_crypto
driver. We can think of the global WQ per IAA as a WQ to which all cores
on that socket can submit compress jobs.

To avail of this feature, the user must configure 2 WQs per IAA in order
to enable distribution of compress jobs to multiple IAA devices. Each
IAA will have 2 WQs:

 wq.0 (local WQ):
   Used for decompress jobs from cores mapped by the cpu_to_iaa()
   "even balancing of logical cores to IAA devices" algorithm.

 wq.1 (global WQ):
   Used for compress jobs from *all* logical cores on that socket.

The iaa_crypto driver will place all global WQs from all same-socket IAA
devices in the global_wq_table per cpu on that socket. When the driver
receives a compress job, it will look up the "next" global WQ in the
cpu's global_wq_table to submit the descriptor.

The starting wq in the global_wq_table for each cpu is the global wq
associated with the IAA nearest to it, so that we stagger the starting
global wq for each process. This results in very uniform usage of all
IAAs for compress jobs.

Two new driver module parameters are added for this feature:

g_wqs_per_iaa (default 1):
/sys/bus/dsa/drivers/crypto/g_wqs_per_iaa
This represents the number of global WQs that can be configured per IAA
device.
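The staggered round-robin selection described above can be sketched as a small standalone model (illustrative only — `gwq_model`, `gwq_set_start` and `gwq_next` are hypothetical names, not the driver's per-cpu table code):

```c
#include <assert.h>

/*
 * Illustrative userspace model (not driver code) of the per-cpu global
 * WQ rotation: each cpu's view of the socket's global WQs starts at the
 * WQ of its nearest IAA, and successive compress jobs advance
 * round-robin through all global WQs on the socket.
 */
struct gwq_model {
	int cur_wq;	/* index of the WQ the next job is sent to */
	int n_wqs;	/* number of global WQs on this socket */
};

/* Stagger the starting WQ by the cpu's nearest-IAA index. */
static void gwq_set_start(struct gwq_model *m, int nearest_iaa, int nr_iaa)
{
	int start = (m->n_wqs / nr_iaa) * nearest_iaa;

	if (start >= 0 && start < m->n_wqs)
		m->cur_wq = start;
}

/* Pick the WQ for the next compress job, then advance round-robin. */
static int gwq_next(struct gwq_model *m)
{
	int wq = m->cur_wq;

	if (++m->cur_wq >= m->n_wqs)
		m->cur_wq = 0;
	return wq;
}
```

With 4 global WQs and 4 IAAs, a cpu nearest to IAA 2 starts at WQ 2 and cycles 2, 3, 0, 1, 2, ... so different cpus begin on different devices.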
The default is 1, and is the recommended setting to enable the use of
this feature once the user configures 2 WQs per IAA using higher level
scripts as described in
Documentation/driver-api/crypto/iaa/iaa-crypto.rst.

g_consec_descs_per_gwq (default 1):
/sys/bus/dsa/drivers/crypto/g_consec_descs_per_gwq
This represents the number of consecutive compress jobs that will be
submitted to the same global WQ (i.e. to the same IAA device) from a
given core, before moving to the next global WQ. The default is 1, which
is also the recommended setting to avail of this feature.

The decompress jobs from any core will be sent to the "local" IAA,
namely the one that the driver assigns with the cpu_to_iaa() mapping
algorithm that evenly balances the assignment of logical cores to IAA
devices on a NUMA node.

On a 2-socket Sapphire Rapids server where each socket has 56 cores and
4 IAA devices, this is how the compress/decompress jobs will be mapped
when the user configures 2 WQs per IAA device (which implies wq.1 will
be added to the global WQ table for each logical core on that NUMA
node):

lscpu|grep NUMA
NUMA node(s):        2
NUMA node0 CPU(s):   0-55,112-167
NUMA node1 CPU(s):   56-111,168-223

Compress jobs:
--------------
NUMA node 0:
All cpus (0-55,112-167) can send compress jobs to all IAA devices on the
socket (iax1/iax3/iax5/iax7) in round-robin manner:
iaa   iax1     iax3      iax5      iax7

NUMA node 1:
All cpus (56-111,168-223) can send compress jobs to all IAA devices on
the socket (iax9/iax11/iax13/iax15) in round-robin manner:
iaa   iax9     iax11     iax13     iax15

Decompress jobs:
----------------
NUMA node 0:
cpu   0-13,112-125    14-27,126-139    28-41,140-153    42-55,154-167
iaa   iax1            iax3             iax5             iax7

NUMA node 1:
cpu   56-69,168-181   70-83,182-195    84-97,196-209    98-111,210-223
iaa   iax9            iax11            iax13            iax15

Signed-off-by: Kanchana P Sridhar
---
 drivers/crypto/intel/iaa/iaa_crypto_main.c | 305 ++++++++++++++++++++-
 1 file changed, 290 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/intel/iaa/iaa_crypto_main.c b/drivers/crypto/intel/iaa/iaa_crypto_main.c
index c854a7a1aaa4..2d6c517e9d9b 100644
--- a/drivers/crypto/intel/iaa/iaa_crypto_main.c
+++ b/drivers/crypto/intel/iaa/iaa_crypto_main.c
@@ -29,14 +29,23 @@ static unsigned int nr_iaa;
 static unsigned int nr_cpus;
 static unsigned int nr_nodes;
 static unsigned int nr_cpus_per_node;
-
 /* Number of physical cpus sharing each iaa instance */
 static unsigned int cpus_per_iaa;
 
 static struct crypto_comp *deflate_generic_tfm;
 
 /* Per-cpu lookup table for balanced wqs */
-static struct wq_table_entry __percpu *wq_table;
+static struct wq_table_entry __percpu *wq_table = NULL;
+
+/* Per-cpu lookup table for global wqs shared by all cpus. */
+static struct wq_table_entry __percpu *global_wq_table = NULL;
+
+/*
+ * Per-cpu counter of consecutive descriptors allocated to
+ * the same wq in the global_wq_table, so that we know
+ * when to switch to the next wq in the global_wq_table.
+ */
+static int __percpu *num_consec_descs_per_wq = NULL;
 
 static struct idxd_wq *wq_table_next_wq(int cpu)
 {
@@ -104,26 +113,68 @@ static void wq_table_add(int cpu, struct idxd_wq *wq)
 
 	entry->wqs[entry->n_wqs++] = wq;
 
-	pr_debug("%s: added iaa wq %d.%d to idx %d of cpu %d\n", __func__,
-		 entry->wqs[entry->n_wqs - 1]->idxd->id,
-		 entry->wqs[entry->n_wqs - 1]->id, entry->n_wqs - 1, cpu);
+	pr_debug("%s: added iaa local wq %d.%d to idx %d of cpu %d\n", __func__,
+		 entry->wqs[entry->n_wqs - 1]->idxd->id,
+		 entry->wqs[entry->n_wqs - 1]->id, entry->n_wqs - 1, cpu);
+}
+
+static void global_wq_table_add(int cpu, struct idxd_wq *wq)
+{
+	struct wq_table_entry *entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (WARN_ON(entry->n_wqs == entry->max_wqs))
+		return;
+
+	entry->wqs[entry->n_wqs++] = wq;
+
+	pr_debug("%s: added iaa global wq %d.%d to idx %d of cpu %d\n", __func__,
+		 entry->wqs[entry->n_wqs - 1]->idxd->id,
+		 entry->wqs[entry->n_wqs - 1]->id, entry->n_wqs - 1, cpu);
+}
+
+static void global_wq_table_set_start_wq(int cpu)
+{
+	struct wq_table_entry *entry = per_cpu_ptr(global_wq_table, cpu);
+	int start_wq = (entry->n_wqs / nr_iaa) * cpu_to_iaa(cpu);
+
+	if ((start_wq >= 0) && (start_wq < entry->n_wqs))
+		entry->cur_wq = start_wq;
 }
 
 static void wq_table_free_entry(int cpu)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
 
-	kfree(entry->wqs);
-	memset(entry, 0, sizeof(*entry));
+	if (entry) {
+		kfree(entry->wqs);
+		memset(entry, 0, sizeof(*entry));
+	}
+
+	entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (entry) {
+		kfree(entry->wqs);
+		memset(entry, 0, sizeof(*entry));
+	}
 }
 
 static void wq_table_clear_entry(int cpu)
 {
 	struct wq_table_entry *entry = per_cpu_ptr(wq_table, cpu);
 
-	entry->n_wqs = 0;
-	entry->cur_wq = 0;
-	memset(entry->wqs, 0, entry->max_wqs * sizeof(struct idxd_wq *));
+	if (entry) {
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+		memset(entry->wqs, 0, entry->max_wqs * sizeof(struct idxd_wq *));
+	}
+
+	entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (entry) {
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+		memset(entry->wqs, 0, entry->max_wqs * sizeof(struct idxd_wq *));
+	}
 }
 
 LIST_HEAD(iaa_devices);
@@ -163,6 +214,70 @@ static ssize_t verify_compress_store(struct device_driver *driver,
 }
 static DRIVER_ATTR_RW(verify_compress);
 
+/* Number of global wqs per iaa*/
+static int g_wqs_per_iaa = 1;
+
+static ssize_t g_wqs_per_iaa_show(struct device_driver *driver, char *buf)
+{
+	return sprintf(buf, "%d\n", g_wqs_per_iaa);
+}
+
+static ssize_t g_wqs_per_iaa_store(struct device_driver *driver,
+				   const char *buf, size_t count)
+{
+	int ret = -EBUSY;
+
+	mutex_lock(&iaa_devices_lock);
+
+	if (iaa_crypto_enabled)
+		goto out;
+
+	ret = kstrtoint(buf, 10, &g_wqs_per_iaa);
+	if (ret)
+		goto out;
+
+	ret = count;
+out:
+	mutex_unlock(&iaa_devices_lock);
+
+	return ret;
+}
+static DRIVER_ATTR_RW(g_wqs_per_iaa);
+
+/*
+ * Number of consecutive descriptors to allocate from a
+ * given global wq before switching to the next wq in
+ * the global_wq_table.
+ */
+static int g_consec_descs_per_gwq = 1;
+
+static ssize_t g_consec_descs_per_gwq_show(struct device_driver *driver, char *buf)
+{
+	return sprintf(buf, "%d\n", g_consec_descs_per_gwq);
+}
+
+static ssize_t g_consec_descs_per_gwq_store(struct device_driver *driver,
+					    const char *buf, size_t count)
+{
+	int ret = -EBUSY;
+
+	mutex_lock(&iaa_devices_lock);
+
+	if (iaa_crypto_enabled)
+		goto out;
+
+	ret = kstrtoint(buf, 10, &g_consec_descs_per_gwq);
+	if (ret)
+		goto out;
+
+	ret = count;
+out:
+	mutex_unlock(&iaa_devices_lock);
+
+	return ret;
+}
+static DRIVER_ATTR_RW(g_consec_descs_per_gwq);
+
 /*
  * The iaa crypto driver supports three 'sync' methods determining how
  * compressions and decompressions are performed:
@@ -751,7 +866,20 @@ static void free_wq_table(void)
 	for (cpu = 0; cpu < nr_cpus; cpu++)
 		wq_table_free_entry(cpu);
 
-	free_percpu(wq_table);
+	if (wq_table) {
+		free_percpu(wq_table);
+		wq_table = NULL;
+	}
+
+	if (global_wq_table) {
+		free_percpu(global_wq_table);
+		global_wq_table = NULL;
+	}
+
+	if (num_consec_descs_per_wq) {
+		free_percpu(num_consec_descs_per_wq);
+		num_consec_descs_per_wq = NULL;
+	}
 
 	pr_debug("freed wq table\n");
 }
@@ -774,6 +902,38 @@ static int alloc_wq_table(int max_wqs)
 		}
 
 		entry->max_wqs = max_wqs;
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+	}
+
+	global_wq_table = alloc_percpu(struct wq_table_entry);
+	if (!global_wq_table) {
+		free_wq_table();
+		return -ENOMEM;
+	}
+
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+		entry = per_cpu_ptr(global_wq_table, cpu);
+		entry->wqs = kzalloc(GFP_KERNEL, max_wqs * sizeof(struct wq *));
+		if (!entry->wqs) {
+			free_wq_table();
+			return -ENOMEM;
+		}
+
+		entry->max_wqs = max_wqs;
+		entry->n_wqs = 0;
+		entry->cur_wq = 0;
+	}
+
+	num_consec_descs_per_wq = alloc_percpu(int);
+	if (!num_consec_descs_per_wq) {
+		free_wq_table();
+		return -ENOMEM;
+	}
+
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+		int *num_consec_descs = per_cpu_ptr(num_consec_descs_per_wq, cpu);
+		*num_consec_descs = 0;
 	}
 
 	pr_debug("initialized wq table\n");
@@ -912,9 +1072,14 @@ static int wq_table_add_wqs(int iaa, int cpu)
 	}
 
 	list_for_each_entry(iaa_wq, &found_device->wqs, list) {
-		wq_table_add(cpu, iaa_wq->wq);
+
+		if (((found_device->n_wq - g_wqs_per_iaa) < 1) ||
+		    (n_wqs_added < (found_device->n_wq - g_wqs_per_iaa))) {
+			wq_table_add(cpu, iaa_wq->wq);
+		}
+
 		pr_debug("rebalance: added wq for cpu=%d: iaa wq %d.%d\n",
-			 cpu, iaa_wq->wq->idxd->id, iaa_wq->wq->id);
+			 cpu, iaa_wq->wq->idxd->id, iaa_wq->wq->id);
 		n_wqs_added++;
 	}
 
@@ -927,6 +1092,63 @@ static int wq_table_add_wqs(int iaa, int cpu)
 	return ret;
 }
 
+static int global_wq_table_add_wqs(void)
+{
+	struct iaa_device *iaa_device;
+	int ret = 0, n_wqs_added;
+	struct idxd_device *idxd;
+	struct iaa_wq *iaa_wq;
+	struct pci_dev *pdev;
+	struct device *dev;
+	int cpu, node, node_of_cpu = -1;
+
+	for (cpu = 0; cpu < nr_cpus; cpu++) {
+
+#ifdef CONFIG_NUMA
+		node_of_cpu = -1;
+		for_each_online_node(node) {
+			const struct cpumask *node_cpus;
+			node_cpus = cpumask_of_node(node);
+			if (!cpumask_test_cpu(cpu, node_cpus))
+				continue;
+			node_of_cpu = node;
+			break;
+		}
+#endif
+		list_for_each_entry(iaa_device, &iaa_devices, list) {
+			idxd = iaa_device->idxd;
+			pdev = idxd->pdev;
+			dev = &pdev->dev;
+
+#ifdef CONFIG_NUMA
+			if (dev && (node_of_cpu != dev->numa_node))
+				continue;
+#endif
+
+			if (iaa_device->n_wq <= g_wqs_per_iaa)
+				continue;
+
+			n_wqs_added = 0;
+
+			list_for_each_entry(iaa_wq, &iaa_device->wqs, list) {
+
+				if (n_wqs_added < (iaa_device->n_wq - g_wqs_per_iaa)) {
+					n_wqs_added++;
+				}
+				else {
+					global_wq_table_add(cpu, iaa_wq->wq);
+					pr_debug("rebalance: added global wq for cpu=%d: iaa wq %d.%d\n",
+						 cpu, iaa_wq->wq->idxd->id, iaa_wq->wq->id);
+				}
+			}
+		}
+
+		global_wq_table_set_start_wq(cpu);
+	}
+
+	return ret;
+}
+
 /*
  * Rebalance the wq table so that given a cpu, it's easy to find the
  * closest IAA instance. The idea is to try to choose the most
@@ -961,6 +1183,7 @@ static void rebalance_wq_table(void)
 	}
 
 	pr_debug("Finished rebalance local wqs.");
+	global_wq_table_add_wqs();
 }
 
 static inline int check_completion(struct device *dev,
@@ -1509,6 +1732,27 @@ static int iaa_decompress(struct crypto_tfm *tfm, struct acomp_req *req,
 	goto out;
 }
 
+/*
+ * Caller should make sure to call only if the
+ * per_cpu_ptr "global_wq_table" is non-NULL
+ * and has at least one wq configured.
+ */
+static struct idxd_wq *global_wq_table_next_wq(int cpu)
+{
+	struct wq_table_entry *entry = per_cpu_ptr(global_wq_table, cpu);
+	int *num_consec_descs = per_cpu_ptr(num_consec_descs_per_wq, cpu);
+
+	if ((*num_consec_descs) == g_consec_descs_per_gwq) {
+		if (++entry->cur_wq >= entry->n_wqs)
+			entry->cur_wq = 0;
+		*num_consec_descs = 0;
+	}
+
+	++(*num_consec_descs);
+
+	return entry->wqs[entry->cur_wq];
+}
+
 static int iaa_comp_acompress(struct acomp_req *req)
 {
 	struct iaa_compression_ctx *compression_ctx;
@@ -1521,6 +1765,7 @@ static int iaa_comp_acompress(struct acomp_req *req)
 	struct idxd_wq *wq;
 	struct device *dev;
 	int order = -1;
+	struct wq_table_entry *entry;
 
 	compression_ctx = crypto_tfm_ctx(tfm);
 
@@ -1535,8 +1780,15 @@ static int iaa_comp_acompress(struct acomp_req *req)
 	}
 
 	cpu = get_cpu();
-	wq = wq_table_next_wq(cpu);
+	entry = per_cpu_ptr(global_wq_table, cpu);
+
+	if (!entry || entry->n_wqs == 0) {
+		wq = wq_table_next_wq(cpu);
+	} else {
+		wq = global_wq_table_next_wq(cpu);
+	}
 	put_cpu();
+
 	if (!wq) {
 		pr_debug("no wq configured for cpu=%d\n", cpu);
 		return -ENODEV;
@@ -2145,13 +2397,32 @@ static int __init iaa_crypto_init_module(void)
 		goto err_sync_attr_create;
 	}
 
+	ret = driver_create_file(&iaa_crypto_driver.drv,
+				 &driver_attr_g_wqs_per_iaa);
+	if (ret) {
+		pr_debug("IAA g_wqs_per_iaa attr creation failed\n");
+		goto err_g_wqs_per_iaa_attr_create;
+	}
+
+	ret = driver_create_file(&iaa_crypto_driver.drv,
+				 &driver_attr_g_consec_descs_per_gwq);
+	if (ret) {
+		pr_debug("IAA g_consec_descs_per_gwq attr creation failed\n");
+		goto err_g_consec_descs_per_gwq_attr_create;
+	}
+
 	if (iaa_crypto_debugfs_init())
 		pr_warn("debugfs init failed, stats not available\n");
 
 	pr_debug("initialized\n");
 out:
 	return ret;
-
+err_g_consec_descs_per_gwq_attr_create:
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_g_wqs_per_iaa);
+err_g_wqs_per_iaa_attr_create:
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_sync_mode);
 err_sync_attr_create:
 	driver_remove_file(&iaa_crypto_driver.drv,
 			   &driver_attr_verify_compress);
@@ -2175,6 +2446,10 @@ static void __exit iaa_crypto_cleanup_module(void)
 			   &driver_attr_sync_mode);
 	driver_remove_file(&iaa_crypto_driver.drv,
 			   &driver_attr_verify_compress);
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_g_wqs_per_iaa);
+	driver_remove_file(&iaa_crypto_driver.drv,
+			   &driver_attr_g_consec_descs_per_gwq);
 	idxd_driver_unregister(&iaa_crypto_driver);
 	iaa_aecs_cleanup_fixed();
 	crypto_free_comp(deflate_generic_tfm);

From patchwork Fri Oct 18 06:40:57 2024
id 7F51C6B0099 for ; Fri, 18 Oct 2024 02:41:13 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 155A31C6F67 for ; Fri, 18 Oct 2024 06:41:00 +0000 (UTC) X-FDA: 82685775690.26.FF38FE7 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by imf27.hostedemail.com (Postfix) with ESMTP id 6686440008 for ; Fri, 18 Oct 2024 06:41:00 +0000 (UTC) Authentication-Results: imf27.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=eridgetG; dmarc=pass (policy=none) header.from=intel.com; spf=pass (imf27.hostedemail.com: domain of kanchana.p.sridhar@intel.com designates 192.198.163.15 as permitted sender) smtp.mailfrom=kanchana.p.sridhar@intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729233552; a=rsa-sha256; cv=none; b=hXDeHAugcq8SPhaJ5YOQdh+DLcxee/RbIZedgOyYbgYxvawhsDaQ6b+UrPjqI33xCUU4oU TFHelwHBzdrlD40KJ3oqtdKbef7Nag7/jraxae08eVcB9+yfz10cEiXR0V7LkX2XW24dar C6TGaQAdDS9kOYEWRvYRk00w6WSFqo0= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=eridgetG; dmarc=pass (policy=none) header.from=intel.com; spf=pass (imf27.hostedemail.com: domain of kanchana.p.sridhar@intel.com designates 192.198.163.15 as permitted sender) smtp.mailfrom=kanchana.p.sridhar@intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729233552; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=1cCwxHrWTh0Fjc9IYlgOU3POgtAU1eNjfTY1o6LL8NI=; b=Rov+elM/hmUikohRvXkPN/fYHDoTxgItVDXmrksFb7pQPjAGoPmRPbT/+BtmGYOBN8VALe 4//C7K5HLgv92Mz58ortmfK0mZ+cG/IgvLkrnodzFknxM988GHUrVyhS3SwDa8ALXRqTQS 9C3+oyrGjdqkEbIsvzxU+P4DSr5XYy4= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729233671; x=1760769671; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qqxb4WHT8bprxcMFQO92c1NKetq9casrAYfBRMVL0vQ=; b=eridgetGJ+xcJU/tRREy2BvT+D40sHScrtyK3oQ4G1w/xaO3zlL450H0 qEPUuhS+PW4OhlO9T5IOIhrjTSC6PZ6tyZ9Sbpv6qrDfzCtbpIdpws7bA /bxBic4nIJAt3HzldjQdivB7My3KDuXWYG0Xcx5DXq3URpIA43Xj1FDck ylTUXjn6FxDpwLr/FM7f4qDezwEtoQkQfJwSM2RbE+s/kQ4nj8SLp4kK/ hMLy/lBxgQA4x5AQ0PhL6A219Y3ZuZwqClRdmqb/HEgKlzm87+LPCH2/A JF26O4qUImqW8Mbd7tvRkp/siONs56XfRwcibKBBhtI8Ao8ex6oGCs67f Q==; X-CSE-ConnectionGUID: Zz0Bvj+OQ1SfyN2HnCP6bA== X-CSE-MsgGUID: PW3nBt7CSbuydBlug38cgw== X-IronPort-AV: E=McAfee;i="6700,10204,11228"; a="28884910" X-IronPort-AV: E=Sophos;i="6.11,212,1725346800"; d="scan'208";a="28884910" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Oct 2024 23:41:03 -0700 X-CSE-ConnectionGUID: KzCXCTJiQFKm7BYr/a601A== X-CSE-MsgGUID: e8EgEA8oSM6ZXNXDpKJJHQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="83607525" Received: from jf5300-b11a338t.jf.intel.com ([10.242.51.6]) by orviesa003.jf.intel.com with ESMTP; 17 Oct 2024 23:41:03 -0700 From: Kanchana P Sridhar To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org, yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev, usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com, 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com, ardb@kernel.org, ebiggers@google.com, surenb@google.com, kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org, joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org, linux-fsdevel@vger.kernel.org Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, 
kanchana.p.sridhar@intel.com Subject: [RFC PATCH v1 09/13] mm: zswap: Config variable to enable compress batching in zswap_store(). Date: Thu, 17 Oct 2024 23:40:57 -0700 Message-Id: <20241018064101.336232-10-kanchana.p.sridhar@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com> References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 6686440008 X-Stat-Signature: b5fe4n3nipxmj44g3pnf6jyxp1je45ht X-Rspam-User: X-HE-Tag: 1729233660-59886 X-HE-Meta: U2FsdGVkX19NQ3FgCUug6Zj3n8SFhNCqf7dY1e0K0WOXbdr77OjVmkK5DdkEJruuUjgYPodYLLmLLWXTVdW2dqr9YWkSHwqFJ5oQ4fpPt+/RGLW/P4HtD2khJDYAHGxlp+7gbwD2y1UBljwenWg19ljAyf8UC2ntde2AjeuESbkW/C3JGQHVVnWS0AGxWl6OQQNaTBW4Vfj3W+ubRtO1f1ZsRMjLXQyBQ3mlrP7sqghjnKpKAFdc7hAmEajSHJdISGQOiT5xinToMVWqNcY0S6gAVufOwEH5H1XMNkBt6PBfuczWXbd1g88tlsHsKGyea0aOBINVLd7Qmb7r+FaKLzAT2NsC0thI1V4al86s05U52KbNHrObWRaDyAoOLpGGwmSoiachPZPfBC0XxfrB2n/P5Ozh7ghEhr8IBlLRk4q2DDB/PeuYyl5KCWN9T81ldIPH662M/l1PC0D+dt6TTkw9R10kAsHMGSu5apZLVcp4vmgdRkxi7EbsedDYGcVfcR60n80wl7Rww7x3aflweGGe7ci7AFq/tzeTP9ZR3VjfKyXqOjr0N7VRScxMfvcKVwfytK++uMELdlYrhDy4orJFybF3WBtaYYH7C6x2/LCrFAVWKD8j+swYRla4FDaCTPPaz+Cvmg51C8Wc7/aOYaf8K45cL8nax2wOjNVBQxju8gbMil3kA2dZeTno2vzTykgncqhe63GO5LMM+b8NJbT12Yedevav1DM9zUwrRJO9h8z+mpPmA/kMiWGbcqbtg5VD75B7xms7/UU84+wVRfJuClBWlnHgQzbF1q1fTusETncvxoeb5teW5hG57oxu8p9joZI4qHHylD8Wd5H6N+NA1KBbTsMInbrRss+BclKFixX4De9sTWuA44XO8DZbqgYE1IiNqQoSefFcm6ZPsUEzRiApygbdPdvd/QaTSe+jTgBnL7pGAlgRKvbobYRT1/kUPJaI4IcagNNaFUB Nx6TrGpC 6c8nuBsT8448FEJJecOV1A6CpXfzdJjsT53wWGaFbWjyIbVSUq3G6YRScO0OAmp0ieivNX5s7JzPF7+3reViUhmH9YxLC/UGpWye18g9nC2EBW7MpG85JfKKVeXw9AN5jW0wHAONKfLoZ6R6wMTlLmJZbpIRJ1yW5YsiMahRNM9/lS0idpASGFEjlNFVPZ8SriGURXpyKDm2F7/30pmns/QHOHpffkGGjeONE X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Add 
a new zswap config variable that controls whether zswap_store() will compress a batch of pages, for instance, the pages in a large folio:

  CONFIG_ZSWAP_STORE_BATCHING_ENABLED

The existing CONFIG_CRYPTO_DEV_IAA_CRYPTO variable, added in commit ea7a5cbb4369 ("crypto: iaa - Add Intel IAA Compression Accelerator crypto driver core"), is used to detect whether the system has the Intel In-Memory Analytics Accelerator (IAA) and whether the iaa_crypto module is available. If so, the kernel build will prompt for CONFIG_ZSWAP_STORE_BATCHING_ENABLED. Hence, users can set CONFIG_ZSWAP_STORE_BATCHING_ENABLED="y" only on systems that have Intel IAA.

If CONFIG_ZSWAP_STORE_BATCHING_ENABLED is enabled, and IAA is configured as the zswap compressor, zswap_store() will process the pages in a large folio in batches, i.e., multiple pages at a time. Pages in a batch are compressed in parallel in hardware, then stored. On systems without Intel IAA, and/or if zswap uses a software compressor, pages in the batch are compressed sequentially and stored.

The patch also implements a zswap API that returns the status of this config variable.
Suggested-by: Ying Huang
Signed-off-by: Kanchana P Sridhar
---
 include/linux/zswap.h |  6 ++++++
 mm/Kconfig            | 12 ++++++++++++
 mm/zswap.c            | 14 ++++++++++++++
 3 files changed, 32 insertions(+)

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index d961ead91bf1..74ad2a24b309 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -24,6 +24,7 @@ struct zswap_lruvec_state {
 	atomic_long_t nr_disk_swapins;
 };
 
+bool zswap_store_batching_enabled(void);
 unsigned long zswap_total_pages(void);
 bool zswap_store(struct folio *folio);
 bool zswap_load(struct folio *folio);
@@ -39,6 +40,11 @@ bool zswap_never_enabled(void);
 
 struct zswap_lruvec_state {};
 
+static inline bool zswap_store_batching_enabled(void)
+{
+	return false;
+}
+
 static inline bool zswap_store(struct folio *folio)
 {
 	return false;
diff --git a/mm/Kconfig b/mm/Kconfig
index 33fa51d608dc..26d1a5cee471 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -125,6 +125,18 @@ config ZSWAP_COMPRESSOR_DEFAULT
 	default "zstd" if ZSWAP_COMPRESSOR_DEFAULT_ZSTD
 	default ""
 
+config ZSWAP_STORE_BATCHING_ENABLED
+	bool "Batching of zswap stores with Intel IAA"
+	depends on ZSWAP && CRYPTO_DEV_IAA_CRYPTO
+	default n
+	help
+	  Enables zswap_store to swap out large folios in batches of 8 pages,
+	  rather than a page at a time, if the system has Intel IAA for hardware
+	  acceleration of compressions. If IAA is configured as the zswap
+	  compressor, this will parallelize batch compression of up to 8 pages
+	  in the folio in hardware, thereby improving large folio compression
+	  throughput and reducing swapout latency.
+
 choice
 	prompt "Default allocator"
 	depends on ZSWAP
diff --git a/mm/zswap.c b/mm/zswap.c
index 948c9745ee57..4893302d8c34 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -127,6 +127,15 @@ static bool zswap_shrinker_enabled = IS_ENABLED(
 		CONFIG_ZSWAP_SHRINKER_DEFAULT_ON);
 module_param_named(shrinker_enabled, zswap_shrinker_enabled, bool, 0644);
 
+/*
+ * Enable/disable batching of compressions if zswap_store is called with a
+ * large folio. If enabled, and if IAA is the zswap compressor, pages are
+ * compressed in parallel in batches of say, 8 pages.
+ * If not, every page is compressed sequentially.
+ */
+static bool __zswap_store_batching_enabled = IS_ENABLED(
+		CONFIG_ZSWAP_STORE_BATCHING_ENABLED);
+
 bool zswap_is_enabled(void)
 {
 	return zswap_enabled;
@@ -241,6 +250,11 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 	pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,		\
 		 zpool_get_type((p)->zpool))
 
+__always_inline bool zswap_store_batching_enabled(void)
+{
+	return __zswap_store_batching_enabled;
+}
+
 /*********************************
  * pool functions
 **********************************/

From patchwork Fri Oct 18 06:40:58 2024
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 10/13] mm: zswap: Create multiple reqs/buffers in crypto_acomp_ctx if platform has IAA.
Date: Thu, 17 Oct 2024 23:40:58 -0700
Message-Id: <20241018064101.336232-11-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
Intel IAA hardware acceleration can be used effectively to improve the zswap_store() performance of large folios by batching multiple pages in a folio to be compressed in parallel by IAA. Hence, to build compress batching of zswap large folio stores using IAA, we need to be able to submit a batch of compress jobs from zswap to the hardware, to be compressed in parallel, when the iaa_crypto "async" mode is used.

The IAA compress batching paradigm works as follows:

1) Submit N crypto_acomp_compress() jobs using N requests.
2) Use the iaa_crypto driver's async poll() method to check for the jobs to complete.
3) There are no ordering constraints implied by submission; hence we can loop through the requests and process any job that has completed.
4) This repeats until all jobs have completed with a success/error status.

To facilitate this, we need to provide for multiple acomp_reqs in "struct crypto_acomp_ctx", each representing a distinct compress job. Likewise, there needs to be a distinct destination buffer corresponding to each acomp_req.

If CONFIG_ZSWAP_STORE_BATCHING_ENABLED is enabled, this patch sets the SWAP_CRYPTO_SUB_BATCH_SIZE constant to 8UL. This implies each per-cpu crypto_acomp_ctx associated with the zswap_pool can submit up to 8 acomp_reqs at a time to accomplish parallel compressions. If IAA is not present and/or CONFIG_ZSWAP_STORE_BATCHING_ENABLED is not set, SWAP_CRYPTO_SUB_BATCH_SIZE is set to 1UL.

On an Intel Sapphire Rapids server, each socket has 4 IAA devices, each of which has 2 compress engines and 8 decompress engines.
Experiments modeling a contended system with, say, 72 processes running under a cgroup with a fixed memory limit have shown a significant performance improvement from dispatching compress jobs from all cores to all the IAA devices on the socket. Hence, SWAP_CRYPTO_SUB_BATCH_SIZE is set to 8 to maximize compression throughput if IAA is available.

The definition of "struct crypto_acomp_ctx" is modified to make its req/buffer members arrays of size SWAP_CRYPTO_SUB_BATCH_SIZE. Thus, the added memory footprint of this per-cpu structure for batching is incurred only on platforms that have Intel IAA.

Suggested-by: Ying Huang
Signed-off-by: Kanchana P Sridhar
---
 mm/swap.h  |  11 ++++++
 mm/zswap.c | 104 ++++++++++++++++++++++++++++++++++-------------------
 2 files changed, 78 insertions(+), 37 deletions(-)

diff --git a/mm/swap.h b/mm/swap.h
index ad2f121de970..566616c971d4 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -8,6 +8,17 @@ struct mempolicy;
 #include  /* for swp_offset */
 #include  /* for bio_end_io_t */
 
+/*
+ * For IAA compression batching:
+ * Maximum number of IAA acomp compress requests that will be processed
+ * in a sub-batch.
+ */
+#if defined(CONFIG_ZSWAP_STORE_BATCHING_ENABLED)
+#define SWAP_CRYPTO_SUB_BATCH_SIZE 8UL
+#else
+#define SWAP_CRYPTO_SUB_BATCH_SIZE 1UL
+#endif
+
 /* linux/mm/page_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
diff --git a/mm/zswap.c b/mm/zswap.c
index 4893302d8c34..579869d1bdf6 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -152,9 +152,9 @@ bool zswap_never_enabled(void)
 
 struct crypto_acomp_ctx {
 	struct crypto_acomp *acomp;
-	struct acomp_req *req;
+	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
 	struct crypto_wait wait;
-	u8 *buffer;
 	struct mutex mutex;
 	bool is_sleepable;
 };
@@ -832,49 +832,64 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
 	struct crypto_acomp *acomp;
-	struct acomp_req *req;
 	int ret;
+	int i, j;
 
 	mutex_init(&acomp_ctx->mutex);
 
-	acomp_ctx->buffer = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));
-	if (!acomp_ctx->buffer)
-		return -ENOMEM;
-
 	acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
 	if (IS_ERR(acomp)) {
 		pr_err("could not alloc crypto acomp %s : %ld\n",
 				pool->tfm_name, PTR_ERR(acomp));
-		ret = PTR_ERR(acomp);
-		goto acomp_fail;
+		return PTR_ERR(acomp);
 	}
 	acomp_ctx->acomp = acomp;
 	acomp_ctx->is_sleepable = acomp_is_async(acomp);
 
-	req = acomp_request_alloc(acomp_ctx->acomp);
-	if (!req) {
-		pr_err("could not alloc crypto acomp_request %s\n",
-		       pool->tfm_name);
-		ret = -ENOMEM;
-		goto req_fail;
+	for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i) {
+		acomp_ctx->buffer[i] = kmalloc_node(PAGE_SIZE * 2,
+						    GFP_KERNEL, cpu_to_node(cpu));
+		if (!acomp_ctx->buffer[i]) {
+			for (j = 0; j < i; ++j)
+				kfree(acomp_ctx->buffer[j]);
+			ret = -ENOMEM;
+			goto buf_fail;
+		}
+	}
+
+	for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i) {
+		acomp_ctx->req[i] = acomp_request_alloc(acomp_ctx->acomp);
+		if (!acomp_ctx->req[i]) {
+			pr_err("could not alloc crypto acomp_request req[%d] %s\n",
+			       i, pool->tfm_name);
+			for (j = 0; j < i; ++j)
+				acomp_request_free(acomp_ctx->req[j]);
+			ret = -ENOMEM;
+			goto req_fail;
+		}
 	}
-	acomp_ctx->req = req;
 
+	/*
+	 * The crypto_wait is used only in fully synchronous, i.e., with scomp
+	 * or non-poll mode of acomp, hence there is only one "wait" per
+	 * acomp_ctx, with callback set to req[0].
+	 */
 	crypto_init_wait(&acomp_ctx->wait);
 	/*
 	 * if the backend of acomp is async zip, crypto_req_done() will wakeup
 	 * crypto_wait_req(); if the backend of acomp is scomp, the callback
 	 * won't be called, crypto_wait_req() will return without blocking.
 	 */
-	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+	acomp_request_set_callback(acomp_ctx->req[0], CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   crypto_req_done, &acomp_ctx->wait);
 
 	return 0;
 
 req_fail:
+	for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i)
+		kfree(acomp_ctx->buffer[i]);
+buf_fail:
 	crypto_free_acomp(acomp_ctx->acomp);
-acomp_fail:
-	kfree(acomp_ctx->buffer);
 	return ret;
 }
 
@@ -884,11 +899,17 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
 
 	if (!IS_ERR_OR_NULL(acomp_ctx)) {
-		if (!IS_ERR_OR_NULL(acomp_ctx->req))
-			acomp_request_free(acomp_ctx->req);
+		int i;
+
+		for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i)
+			if (!IS_ERR_OR_NULL(acomp_ctx->req[i]))
+				acomp_request_free(acomp_ctx->req[i]);
+
+		for (i = 0; i < SWAP_CRYPTO_SUB_BATCH_SIZE; ++i)
+			kfree(acomp_ctx->buffer[i]);
+
 		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
 			crypto_free_acomp(acomp_ctx->acomp);
-		kfree(acomp_ctx->buffer);
 	}
 
 	return 0;
@@ -911,7 +932,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 
 	mutex_lock(&acomp_ctx->mutex);
 
-	dst = acomp_ctx->buffer;
+	dst = acomp_ctx->buffer[0];
 	sg_init_table(&input, 1);
 	sg_set_page(&input, page, PAGE_SIZE, 0);
 
@@ -921,7 +942,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	 * giving the dst buffer with enough length to avoid buffer overflow.
 	 */
 	sg_init_one(&output, dst, PAGE_SIZE * 2);
-	acomp_request_set_params(acomp_ctx->req, &input, &output, PAGE_SIZE, dlen);
+	acomp_request_set_params(acomp_ctx->req[0], &input, &output, PAGE_SIZE, dlen);
 
 	/*
 	 * If the crypto_acomp provides an asynchronous poll() interface,
@@ -940,19 +961,20 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	 * parallel.
 	 */
 	if (acomp_ctx->acomp->poll) {
-		comp_ret = crypto_acomp_compress(acomp_ctx->req);
+		comp_ret = crypto_acomp_compress(acomp_ctx->req[0]);
 		if (comp_ret == -EINPROGRESS) {
 			do {
-				comp_ret = crypto_acomp_poll(acomp_ctx->req);
+				comp_ret = crypto_acomp_poll(acomp_ctx->req[0]);
 				if (comp_ret && comp_ret != -EAGAIN)
 					break;
 			} while (comp_ret);
 		}
 	} else {
-		comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req), &acomp_ctx->wait);
+		comp_ret = crypto_wait_req(crypto_acomp_compress(acomp_ctx->req[0]),
+					   &acomp_ctx->wait);
 	}
 
-	dlen = acomp_ctx->req->dlen;
+	dlen = acomp_ctx->req[0]->dlen;
 	if (comp_ret)
 		goto unlock;
 
@@ -1006,31 +1028,39 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	 */
 	if ((acomp_ctx->is_sleepable && !zpool_can_sleep_mapped(zpool)) ||
	    !virt_addr_valid(src)) {
-		memcpy(acomp_ctx->buffer, src, entry->length);
-		src = acomp_ctx->buffer;
+		memcpy(acomp_ctx->buffer[0], src, entry->length);
+		src = acomp_ctx->buffer[0];
 		zpool_unmap_handle(zpool, entry->handle);
 	}
 
 	sg_init_one(&input, src, entry->length);
 	sg_init_table(&output, 1);
 	sg_set_folio(&output, folio, PAGE_SIZE, 0);
-	acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
+	acomp_request_set_params(acomp_ctx->req[0], &input, &output,
+				 entry->length, PAGE_SIZE);
 
 	if (acomp_ctx->acomp->poll) {
-		ret = crypto_acomp_decompress(acomp_ctx->req);
+		ret = crypto_acomp_decompress(acomp_ctx->req[0]);
 		if (ret == -EINPROGRESS) {
 			do {
-				ret = crypto_acomp_poll(acomp_ctx->req);
+				ret = crypto_acomp_poll(acomp_ctx->req[0]);
 				BUG_ON(ret && ret != -EAGAIN);
 			} while (ret);
 		}
 	} else {
-		BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
+		BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req[0]),
+				       &acomp_ctx->wait));
 	}
-	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
-	mutex_unlock(&acomp_ctx->mutex);
+	BUG_ON(acomp_ctx->req[0]->dlen != PAGE_SIZE);
 
-	if (src != acomp_ctx->buffer)
+	if (src != acomp_ctx->buffer[0])
 		zpool_unmap_handle(zpool, entry->handle);
+
+	/*
+	 * It is safer to unlock the mutex after the check for
+	 * "src != acomp_ctx->buffer[0]" so that the value of "src"
+	 * does not change.
+	 */
+	mutex_unlock(&acomp_ctx->mutex);
 }
 
 /*********************************

From patchwork Fri Oct 18 06:40:59 2024
From: Kanchana P Sridhar
Subject: [RFC PATCH v1 11/13] mm: swap: Add IAA batch compression API swap_crypto_acomp_compress_batch().
Date: Thu, 17 Oct 2024 23:40:59 -0700
Message-Id: <20241018064101.336232-12-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

Added a new API swap_crypto_acomp_compress_batch() that does batch compression.
A system that has Intel IAA can use this API to submit a batch of compress jobs for parallel compression in hardware, to improve performance. On a system without IAA, this API will process each compress job sequentially.

The purpose of this API is to be invocable from any swap module that needs to compress large folios, or a batch of pages in the general case. For instance, zswap would batch-compress up to SWAP_CRYPTO_SUB_BATCH_SIZE (i.e., 8 if the system has IAA) pages in the large folio in parallel to improve zswap_store() performance. Towards this eventual goal:

1) The definition of "struct crypto_acomp_ctx" is moved to mm/swap.h so that mm modules like swap_state.c and zswap.c can reference it.

2) The swap_crypto_acomp_compress_batch() interface is implemented in swap_state.c.

It would be preferable for "struct crypto_acomp_ctx" to be defined in, and for swap_crypto_acomp_compress_batch() to be exported via, include/linux/swap.h so that modules outside mm (e.g., zram) can potentially use the API for batch compressions with IAA. I would appreciate RFC comments on this.
Signed-off-by: Kanchana P Sridhar
---
 mm/swap.h       |  45 +++++++++++++++++++
 mm/swap_state.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
 mm/zswap.c      |   9 ----
 3 files changed, 160 insertions(+), 9 deletions(-)

diff --git a/mm/swap.h b/mm/swap.h
index 566616c971d4..4dcb67e2cc33 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -7,6 +7,7 @@ struct mempolicy;
 #ifdef CONFIG_SWAP
 #include  /* for swp_offset */
 #include  /* for bio_end_io_t */
+#include 
 
 /*
  * For IAA compression batching:
@@ -19,6 +20,39 @@ struct mempolicy;
 #define SWAP_CRYPTO_SUB_BATCH_SIZE 1UL
 #endif
 
+/* linux/mm/swap_state.c, zswap.c */
+struct crypto_acomp_ctx {
+	struct crypto_acomp *acomp;
+	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct crypto_wait wait;
+	struct mutex mutex;
+	bool is_sleepable;
+};
+
+/**
+ * This API provides IAA compress batching functionality for use by swap
+ * modules.
+ * The acomp_ctx mutex should be locked/unlocked before/after calling this
+ * procedure.
+ *
+ * @pages: Pages to be compressed.
+ * @dsts: Pre-allocated destination buffers to store results of IAA compression.
+ * @dlens: Will contain the compressed lengths.
+ * @errors: Will contain a 0 if the page was successfully compressed, or a
+ *          non-0 error value to be processed by the calling function.
+ * @nr_pages: The number of pages, up to SWAP_CRYPTO_SUB_BATCH_SIZE,
+ *            to be compressed.
+ * @acomp_ctx: The acomp context for iaa_crypto/other compressor.
+ */
+void swap_crypto_acomp_compress_batch(
+	struct page *pages[],
+	u8 *dsts[],
+	unsigned int dlens[],
+	int errors[],
+	int nr_pages,
+	struct crypto_acomp_ctx *acomp_ctx);
+
 /* linux/mm/page_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
@@ -119,6 +153,17 @@ static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
 #else /* CONFIG_SWAP */
 struct swap_iocb;
 
+struct crypto_acomp_ctx {};
+static inline void swap_crypto_acomp_compress_batch(
+	struct page *pages[],
+	u8 *dsts[],
+	unsigned int dlens[],
+	int errors[],
+	int nr_pages,
+	struct crypto_acomp_ctx *acomp_ctx)
+{
+}
+
 static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 {
 }
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 4669f29cf555..117c3caa5679 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "internal.h"
 #include "swap.h"
@@ -742,6 +744,119 @@ void exit_swap_address_space(unsigned int type)
 	swapper_spaces[type] = NULL;
 }
 
+#ifdef CONFIG_SWAP
+
+/**
+ * This API provides IAA compress batching functionality for use by swap
+ * modules.
+ * The acomp_ctx mutex should be locked/unlocked before/after calling this
+ * procedure.
+ *
+ * @pages: Pages to be compressed.
+ * @dsts: Pre-allocated destination buffers to store results of IAA compression.
+ * @dlens: Will contain the compressed lengths.
+ * @errors: Will contain a 0 if the page was successfully compressed, or a
+ *          non-0 error value to be processed by the calling function.
+ * @nr_pages: The number of pages, up to SWAP_CRYPTO_SUB_BATCH_SIZE,
+ *            to be compressed.
+ * @acomp_ctx: The acomp context for iaa_crypto/other compressor.
+ */
+void swap_crypto_acomp_compress_batch(
+	struct page *pages[],
+	u8 *dsts[],
+	unsigned int dlens[],
+	int errors[],
+	int nr_pages,
+	struct crypto_acomp_ctx *acomp_ctx)
+{
+	struct scatterlist inputs[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct scatterlist outputs[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	bool compressions_done = false;
+	int i, j;
+
+	BUG_ON(nr_pages > SWAP_CRYPTO_SUB_BATCH_SIZE);
+
+	/*
+	 * Prepare and submit acomp_reqs to IAA.
+	 * IAA will process these compress jobs in parallel in async mode.
+	 * If the compressor does not support a poll() method, or if IAA is
+	 * used in sync mode, the jobs will be processed sequentially using
+	 * acomp_ctx->req[0] and acomp_ctx->wait.
+	 */
+	for (i = 0; i < nr_pages; ++i) {
+		j = acomp_ctx->acomp->poll ? i : 0;
+		sg_init_table(&inputs[i], 1);
+		sg_set_page(&inputs[i], pages[i], PAGE_SIZE, 0);
+
+		/*
+		 * Each acomp_ctx->buffer[] is of size (PAGE_SIZE * 2).
+		 * Reflect same in sg_list.
+		 */
+		sg_init_one(&outputs[i], dsts[i], PAGE_SIZE * 2);
+		acomp_request_set_params(acomp_ctx->req[j], &inputs[i],
+					 &outputs[i], PAGE_SIZE, dlens[i]);
+
+		/*
+		 * If the crypto_acomp provides an asynchronous poll()
+		 * interface, submit the request to the driver now, and poll for
+		 * a completion status later, after all descriptors have been
+		 * submitted. If the crypto_acomp does not provide a poll()
+		 * interface, submit the request and wait for it to complete,
+		 * i.e., synchronously, before moving on to the next request.
+		 */
+		if (acomp_ctx->acomp->poll) {
+			errors[i] = crypto_acomp_compress(acomp_ctx->req[j]);
+
+			if (errors[i] != -EINPROGRESS)
+				errors[i] = -EINVAL;
+			else
+				errors[i] = -EAGAIN;
+		} else {
+			errors[i] = crypto_wait_req(
+				crypto_acomp_compress(acomp_ctx->req[j]),
+				&acomp_ctx->wait);
+			if (!errors[i])
+				dlens[i] = acomp_ctx->req[j]->dlen;
+		}
+	}
+
+	/*
+	 * If not doing async compressions, the batch has been processed at
+	 * this point and we can return.
+	 */
+	if (!acomp_ctx->acomp->poll)
+		return;
+
+	/*
+	 * Poll for and process IAA compress job completions
+	 * in out-of-order manner.
+	 */
+	while (!compressions_done) {
+		compressions_done = true;
+
+		for (i = 0; i < nr_pages; ++i) {
+			/*
+			 * Skip, if the compression has already completed
+			 * successfully or with an error.
+			 */
+			if (errors[i] != -EAGAIN)
+				continue;
+
+			errors[i] = crypto_acomp_poll(acomp_ctx->req[i]);
+
+			if (errors[i]) {
+				if (errors[i] == -EAGAIN)
+					compressions_done = false;
+			} else {
+				dlens[i] = acomp_ctx->req[i]->dlen;
+			}
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(swap_crypto_acomp_compress_batch);
+
+#endif /* CONFIG_SWAP */
+
 static int swap_vma_ra_win(struct vm_fault *vmf, unsigned long *start,
 			   unsigned long *end)
 {
diff --git a/mm/zswap.c b/mm/zswap.c
index 579869d1bdf6..cab3114321f9 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -150,15 +150,6 @@ bool zswap_never_enabled(void)
  * data structures
  **********************************/
 
-struct crypto_acomp_ctx {
-	struct crypto_acomp *acomp;
-	struct acomp_req *req[SWAP_CRYPTO_SUB_BATCH_SIZE];
-	u8 *buffer[SWAP_CRYPTO_SUB_BATCH_SIZE];
-	struct crypto_wait wait;
-	struct mutex mutex;
-	bool is_sleepable;
-};
-
 /*
  * The lock ordering is zswap_tree.lock -> zswap_pool.lru_lock.
 * The only case where lru_lock is not acquired while holding tree.lock is

From patchwork Fri Oct 18 06:41:00 2024
X-Patchwork-Submitter: "Sridhar, Kanchana P"
X-Patchwork-Id: 13841239
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
 yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
 usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
 21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
 ardb@kernel.org, ebiggers@google.com, surenb@google.com,
 kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
 brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
 joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
 linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com,
 kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 12/13] mm: zswap: Compress batching with Intel IAA in
 zswap_store() of large folios.
Date: Thu, 17 Oct 2024 23:41:00 -0700
Message-Id: <20241018064101.336232-13-kanchana.p.sridhar@intel.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>

If the system has Intel IAA, and if CONFIG_ZSWAP_STORE_BATCHING_ENABLED
is set to "y", zswap_store() will call
swap_crypto_acomp_compress_batch() to batch compress up to
SWAP_CRYPTO_SUB_BATCH_SIZE pages in large folios in parallel, using the
multiple compress engines available in IAA hardware. On platforms with
multiple IAA devices per socket, compress jobs from all cores in a socket
will be distributed among all IAA devices on the socket by the iaa_crypto
driver.

If zswap_store() is called with a large folio, and if
zswap_store_batching_enabled() returns "true", it will call the main
__zswap_store_batch_core() interface for compress batching. The interface
represents the extensible compress batching architecture that can
potentially be called with a batch of any-order folios from
shrink_folio_list(). In other words, although zswap_store() calls
__zswap_store_batch_core() with exactly one large folio in this patch, we
will reuse this API to reclaim a batch of folios in subsequent patches.

The newly added functions that implement batched stores follow the general
structure of zswap_store() of a large folio. Some amount of restructuring
and optimization is done to minimize failure points for a batch, fail
early, and maximize zswap store pipeline occupancy with
SWAP_CRYPTO_SUB_BATCH_SIZE pages, potentially from multiple folios. This
is intended to maximize reclaim throughput with the IAA hardware's
parallel compressions.

Signed-off-by: Kanchana P Sridhar
---
 include/linux/zswap.h |  84 ++++++
 mm/zswap.c            | 591 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 671 insertions(+), 4 deletions(-)

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 74ad2a24b309..9bbe330686f6 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -24,6 +24,88 @@ struct zswap_lruvec_state {
 	atomic_long_t nr_disk_swapins;
 };
 
+/*
+ * struct zswap_store_sub_batch_page:
+ *
+ * This represents one "zswap batching element", namely, the
+ * attributes associated with a page in a large folio that will
+ * be compressed and stored in zswap.
+ * The term "batch" is reserved
+ * for a conceptual "batch" of folios that can be sent to
+ * zswap_store() by reclaim. The term "sub-batch" is used to describe
+ * a collection of "zswap batching elements", i.e., an array of
+ * "struct zswap_store_sub_batch_page *".
+ *
+ * The zswap compress sub-batch size is specified by
+ * SWAP_CRYPTO_SUB_BATCH_SIZE, currently set as 8UL if the
+ * platform has Intel IAA. This means zswap can store a large folio
+ * by creating sub-batches of up to 8 pages and compressing this
+ * batch using IAA to parallelize the 8 compress jobs in hardware.
+ * For example, a 64KB folio can be compressed as 2 sub-batches of
+ * 8 pages each. This can significantly improve the zswap_store()
+ * performance for large folios.
+ *
+ * Although the page itself is represented directly, the structure
+ * adds a "u8 batch_idx" to represent an index for the folio in a
+ * conceptual "batch of folios" that can be passed to zswap_store().
+ * Conceptually, this allows for up to 256 folios that can be passed
+ * to zswap_store(). If this conceptual number of folios sent to
+ * zswap_store() exceeds 256, the "batch_idx" needs to become u16.
+ */
+struct zswap_store_sub_batch_page {
+	u8 batch_idx;
+	swp_entry_t swpentry;
+	struct obj_cgroup *objcg;
+	struct zswap_entry *entry;
+	int error; /* folio error status. */
+};
+
+/*
+ * struct zswap_store_pipeline_state:
+ *
+ * This stores state during IAA compress batching of (conceptually, a batch
+ * of) folios. The term "pipelining" in this context refers to breaking down
+ * the batch of folios being reclaimed into sub-batches of
+ * SWAP_CRYPTO_SUB_BATCH_SIZE pages, batch compressing and storing the
+ * sub-batch. This concept could be further evolved to use overlap of CPU
+ * computes with IAA computes. For instance, we could stage the post-compress
+ * computes for sub-batch "N-1" to happen in parallel with IAA batch
+ * compression of sub-batch "N".
+ * + * We begin by developing the concept of compress batching. Pipelining with + * overlap can be future work. + * + * @errors: The errors status for the batch of reclaim folios passed in from + * a higher mm layer such as swap_writepage(). + * @pool: A valid zswap_pool. + * @acomp_ctx: The per-cpu pointer to the crypto_acomp_ctx for the @pool. + * @sub_batch: This is an array that represents the sub-batch of up to + * SWAP_CRYPTO_SUB_BATCH_SIZE pages that are being stored + * in zswap. + * @comp_dsts: The destination buffers for crypto_acomp_compress() for each + * page being compressed. + * @comp_dlens: The destination buffers' lengths from crypto_acomp_compress() + * obtained after crypto_acomp_poll() returns completion status, + * for each page being compressed. + * @comp_errors: Compression errors for each page being compressed. + * @nr_comp_pages: Total number of pages in @sub_batch. + * + * Note: + * The max sub-batch size is SWAP_CRYPTO_SUB_BATCH_SIZE, currently 8UL. + * Hence, if SWAP_CRYPTO_SUB_BATCH_SIZE exceeds 256, some of the + * u8 members (except @comp_dsts) need to become u16. 
+ */ +struct zswap_store_pipeline_state { + int *errors; + struct zswap_pool *pool; + struct crypto_acomp_ctx *acomp_ctx; + struct zswap_store_sub_batch_page *sub_batch; + struct page **comp_pages; + u8 **comp_dsts; + unsigned int *comp_dlens; + int *comp_errors; + u8 nr_comp_pages; +}; + bool zswap_store_batching_enabled(void); unsigned long zswap_total_pages(void); bool zswap_store(struct folio *folio); @@ -39,6 +121,8 @@ bool zswap_never_enabled(void); #else struct zswap_lruvec_state {}; +struct zswap_store_sub_batch_page {}; +struct zswap_store_pipeline_state {}; static inline bool zswap_store_batching_enabled(void) { diff --git a/mm/zswap.c b/mm/zswap.c index cab3114321f9..1c12a7b9f4ff 100644 --- a/mm/zswap.c +++ b/mm/zswap.c @@ -130,7 +130,7 @@ module_param_named(shrinker_enabled, zswap_shrinker_enabled, bool, 0644); /* * Enable/disable batching of compressions if zswap_store is called with a * large folio. If enabled, and if IAA is the zswap compressor, pages are - * compressed in parallel in batches of say, 8 pages. + * compressed in parallel in batches of SWAP_CRYPTO_SUB_BATCH_SIZE pages. * If not, every page is compressed sequentially. */ static bool __zswap_store_batching_enabled = IS_ENABLED( @@ -246,6 +246,12 @@ __always_inline bool zswap_store_batching_enabled(void) return __zswap_store_batching_enabled; } +static void __zswap_store_batch_core( + int node_id, + struct folio **folios, + int *errors, + unsigned int nr_folios); + /********************************* * pool functions **********************************/ @@ -906,6 +912,9 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node) return 0; } +/* + * The acomp_ctx->mutex must be locked/unlocked in the calling procedure. 
+ */ static bool zswap_compress(struct page *page, struct zswap_entry *entry, struct zswap_pool *pool) { @@ -921,8 +930,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); - mutex_lock(&acomp_ctx->mutex); - dst = acomp_ctx->buffer[0]; sg_init_table(&input, 1); sg_set_page(&input, page, PAGE_SIZE, 0); @@ -992,7 +999,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry, else if (alloc_ret) zswap_reject_alloc_fail++; - mutex_unlock(&acomp_ctx->mutex); return comp_ret == 0 && alloc_ret == 0; } @@ -1545,10 +1551,17 @@ static ssize_t zswap_store_page(struct page *page, return -EINVAL; } +/* + * Modified to use the IAA compress batching framework implemented in + * __zswap_store_batch_core() if zswap_store_batching_enabled() is true. + * The batching code is intended to significantly improve folio store + * performance over the sequential code. + */ bool zswap_store(struct folio *folio) { long nr_pages = folio_nr_pages(folio); swp_entry_t swp = folio->swap; + struct crypto_acomp_ctx *acomp_ctx; struct obj_cgroup *objcg = NULL; struct mem_cgroup *memcg = NULL; struct zswap_pool *pool; @@ -1556,6 +1569,17 @@ bool zswap_store(struct folio *folio) bool ret = false; long index; + /* + * Improve large folio zswap_store() latency with IAA compress batching. 
+ */ + if (folio_test_large(folio) && zswap_store_batching_enabled()) { + int error = -1; + __zswap_store_batch_core(folio_nid(folio), &folio, &error, 1); + if (!error) + ret = true; + return ret; + } + VM_WARN_ON_ONCE(!folio_test_locked(folio)); VM_WARN_ON_ONCE(!folio_test_swapcache(folio)); @@ -1588,6 +1612,9 @@ bool zswap_store(struct folio *folio) mem_cgroup_put(memcg); } + acomp_ctx = raw_cpu_ptr(pool->acomp_ctx); + mutex_lock(&acomp_ctx->mutex); + for (index = 0; index < nr_pages; ++index) { struct page *page = folio_page(folio, index); ssize_t bytes; @@ -1609,6 +1636,7 @@ bool zswap_store(struct folio *folio) ret = true; put_pool: + mutex_unlock(&acomp_ctx->mutex); zswap_pool_put(pool); put_objcg: obj_cgroup_put(objcg); @@ -1638,6 +1666,561 @@ bool zswap_store(struct folio *folio) return ret; } +/* + * Note: If SWAP_CRYPTO_SUB_BATCH_SIZE exceeds 256, change the + * u8 stack variables in the next several functions, to u16. + */ + +/* + * Propagate the "sbp" error condition to other batch elements belonging to + * the same folio as "sbp". + */ +static __always_inline void zswap_store_propagate_errors( + struct zswap_store_pipeline_state *zst, + u8 error_batch_idx) +{ + u8 i; + + if (zst->errors[error_batch_idx]) + return; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + + if (sbp->batch_idx == error_batch_idx) { + if (!sbp->error) { + if (!IS_ERR_VALUE(sbp->entry->handle)) + zpool_free(zst->pool->zpool, sbp->entry->handle); + + if (sbp->entry) { + zswap_entry_cache_free(sbp->entry); + sbp->entry = NULL; + } + sbp->error = -EINVAL; + } + } + } + + /* + * Set zswap status for the folio to "error" + * for use in swap_writepage. 
+ */ + zst->errors[error_batch_idx] = -EINVAL; +} + +static __always_inline void zswap_process_comp_errors( + struct zswap_store_pipeline_state *zst) +{ + u8 i; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + + if (zst->comp_errors[i]) { + if (zst->comp_errors[i] == -ENOSPC) + zswap_reject_compress_poor++; + else + zswap_reject_compress_fail++; + + if (!sbp->error) + zswap_store_propagate_errors(zst, + sbp->batch_idx); + } + } +} + +static void zswap_compress_batch(struct zswap_store_pipeline_state *zst) +{ + /* + * Compress up to SWAP_CRYPTO_SUB_BATCH_SIZE pages. + * If IAA is the zswap compressor, this compresses the + * pages in parallel, leading to significant performance + * improvements as compared to software compressors. + */ + swap_crypto_acomp_compress_batch( + zst->comp_pages, + zst->comp_dsts, + zst->comp_dlens, + zst->comp_errors, + zst->nr_comp_pages, + zst->acomp_ctx); + + /* + * Scan the sub-batch for any compression errors, + * and invalidate pages with errors, along with other + * pages belonging to the same folio as the error pages. + */ + zswap_process_comp_errors(zst); +} + +static void zswap_zpool_store_sub_batch( + struct zswap_store_pipeline_state *zst) +{ + u8 i; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + struct zpool *zpool; + unsigned long handle; + char *buf; + gfp_t gfp; + int err; + + /* Skip pages that had compress errors. 
*/ + if (sbp->error) + continue; + + zpool = zst->pool->zpool; + gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM; + if (zpool_malloc_support_movable(zpool)) + gfp |= __GFP_HIGHMEM | __GFP_MOVABLE; + err = zpool_malloc(zpool, zst->comp_dlens[i], gfp, &handle); + + if (err) { + if (err == -ENOSPC) + zswap_reject_compress_poor++; + else + zswap_reject_alloc_fail++; + + /* + * An error should be propagated to other pages of the + * same folio in the sub-batch, and zpool resources for + * those pages (in sub-batch order prior to this zpool + * error) should be de-allocated. + */ + zswap_store_propagate_errors(zst, sbp->batch_idx); + continue; + } + + buf = zpool_map_handle(zpool, handle, ZPOOL_MM_WO); + memcpy(buf, zst->comp_dsts[i], zst->comp_dlens[i]); + zpool_unmap_handle(zpool, handle); + + sbp->entry->handle = handle; + sbp->entry->length = zst->comp_dlens[i]; + } +} + +/* + * Returns true if the entry was successfully + * stored in the xarray, and false otherwise. + */ +static bool zswap_store_entry(swp_entry_t page_swpentry, + struct zswap_entry *entry) +{ + struct zswap_entry *old = xa_store(swap_zswap_tree(page_swpentry), + swp_offset(page_swpentry), + entry, GFP_KERNEL); + if (xa_is_err(old)) { + int err = xa_err(old); + + WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err); + zswap_reject_alloc_fail++; + return false; + } + + /* + * We may have had an existing entry that became stale when + * the folio was redirtied and now the new version is being + * swapped out. Get rid of the old. 
+ */ + if (old) + zswap_entry_free(old); + + return true; +} + +static void zswap_batch_compress_post_proc( + struct zswap_store_pipeline_state *zst) +{ + int nr_objcg_pages = 0, nr_pages = 0; + struct obj_cgroup *objcg = NULL; + size_t compressed_bytes = 0; + u8 i; + + zswap_zpool_store_sub_batch(zst); + + for (i = 0; i < zst->nr_comp_pages; ++i) { + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + + if (sbp->error) + continue; + + if (!zswap_store_entry(sbp->swpentry, sbp->entry)) { + zswap_store_propagate_errors(zst, sbp->batch_idx); + continue; + } + + /* + * The entry is successfully compressed and stored in the tree, + * there is no further possibility of failure. Grab refs to the + * pool and objcg. These refs will be dropped by + * zswap_entry_free() when the entry is removed from the tree. + */ + zswap_pool_get(zst->pool); + if (sbp->objcg) + obj_cgroup_get(sbp->objcg); + + /* + * We finish initializing the entry while it's already in xarray. + * This is safe because: + * + * 1. Concurrent stores and invalidations are excluded by folio + * lock. + * + * 2. Writeback is excluded by the entry not being on the LRU yet. + * The publishing order matters to prevent writeback from seeing + * an incoherent entry. + */ + sbp->entry->pool = zst->pool; + sbp->entry->swpentry = sbp->swpentry; + sbp->entry->objcg = sbp->objcg; + sbp->entry->referenced = true; + if (sbp->entry->length) { + INIT_LIST_HEAD(&sbp->entry->lru); + zswap_lru_add(&zswap_list_lru, sbp->entry); + } + + if (!objcg && sbp->objcg) { + objcg = sbp->objcg; + } else if (objcg && sbp->objcg && (objcg != sbp->objcg)) { + obj_cgroup_charge_zswap(objcg, compressed_bytes); + count_objcg_events(objcg, ZSWPOUT, nr_objcg_pages); + compressed_bytes = 0; + nr_objcg_pages = 0; + objcg = sbp->objcg; + } + + if (sbp->objcg) { + compressed_bytes += sbp->entry->length; + ++nr_objcg_pages; + } + + ++nr_pages; + } /* for sub-batch pages. 
*/ + + if (objcg) { + obj_cgroup_charge_zswap(objcg, compressed_bytes); + count_objcg_events(objcg, ZSWPOUT, nr_objcg_pages); + } + + atomic_long_add(nr_pages, &zswap_stored_pages); + count_vm_events(ZSWPOUT, nr_pages); +} + +static void zswap_store_sub_batch(struct zswap_store_pipeline_state *zst) +{ + u8 i; + + for (i = 0; i < zst->nr_comp_pages; ++i) { + zst->comp_dsts[i] = zst->acomp_ctx->buffer[i]; + zst->comp_dlens[i] = PAGE_SIZE; + } /* for sub-batch pages. */ + + /* + * Batch compress sub-batch "N". If IAA is the compressor, the + * hardware will compress multiple pages in parallel. + */ + zswap_compress_batch(zst); + + zswap_batch_compress_post_proc(zst); +} + +static void zswap_add_folio_pages_to_sb( + struct zswap_store_pipeline_state *zst, + struct folio* folio, + u8 batch_idx, + struct obj_cgroup *objcg, + struct zswap_entry *entries[], + long start_idx, + u8 add_nr_pages) +{ + long index; + + for (index = start_idx; index < (start_idx + add_nr_pages); ++index) { + u8 i = zst->nr_comp_pages; + struct zswap_store_sub_batch_page *sbp = &zst->sub_batch[i]; + struct page *page = folio_page(folio, index); + zst->comp_pages[i] = page; + sbp->swpentry = page_swap_entry(page); + sbp->batch_idx = batch_idx; + sbp->objcg = objcg; + sbp->entry = entries[index - start_idx]; + sbp->error = 0; + ++zst->nr_comp_pages; + } +} + +static __always_inline void zswap_store_reset_sub_batch( + struct zswap_store_pipeline_state *zst) +{ + zst->nr_comp_pages = 0; +} + +/* Allocate entries for the next sub-batch. 
 */
+static int zswap_alloc_entries(u8 nr_entries,
+			       struct zswap_entry *entries[],
+			       int node_id)
+{
+	u8 i;
+
+	for (i = 0; i < nr_entries; ++i) {
+		entries[i] = zswap_entry_cache_alloc(GFP_KERNEL, node_id);
+		if (!entries[i]) {
+			u8 j;
+
+			zswap_reject_kmemcache_fail++;
+			for (j = 0; j < i; ++j)
+				zswap_entry_cache_free(entries[j]);
+			return -EINVAL;
+		}
+
+		entries[i]->handle = (unsigned long)ERR_PTR(-EINVAL);
+	}
+
+	return 0;
+}
+
+/*
+ * If the zswap store fails or zswap is disabled, we must invalidate
+ * the possibly stale entries which were previously stored at the
+ * offsets corresponding to each page of the folio. Otherwise,
+ * writeback could overwrite the new data in the swapfile.
+ */
+static void zswap_delete_stored_entries(struct folio *folio)
+{
+	swp_entry_t swp = folio->swap;
+	unsigned type = swp_type(swp);
+	pgoff_t offset = swp_offset(swp);
+	struct zswap_entry *entry;
+	struct xarray *tree;
+	long index;
+
+	for (index = 0; index < folio_nr_pages(folio); ++index) {
+		tree = swap_zswap_tree(swp_entry(type, offset + index));
+		entry = xa_erase(tree, offset + index);
+		if (entry)
+			zswap_entry_free(entry);
+	}
+}
+
+static void zswap_store_process_folio_errors(
+	struct folio **folios,
+	int *errors,
+	unsigned int nr_folios)
+{
+	u8 batch_idx;
+
+	for (batch_idx = 0; batch_idx < nr_folios; ++batch_idx)
+		if (errors[batch_idx])
+			zswap_delete_stored_entries(folios[batch_idx]);
+}
+
+/*
+ * Store a (batch of) any-order large folio(s) in zswap. Each folio will be
+ * broken into sub-batches of SWAP_CRYPTO_SUB_BATCH_SIZE pages, the
+ * sub-batch will be compressed by IAA in parallel, and stored in zpool/xarray.
+ *
+ * This is the main procedure for batching of folios, and batching within
+ * large folios.
+ *
+ * This procedure should only be called if zswap supports batching of stores.
+ * Otherwise, the sequential implementation for storing folios as in the
+ * current zswap_store() should be used.
+ *
+ * The signature of this procedure is meant to allow the calling function
+ * (for instance, swap_writepage()) to pass an array @folios
+ * (the "reclaim batch") of @nr_folios folios to be stored in zswap.
+ * All folios in the batch must have the same swap type and folio_nid
+ * @node_id (simplifying assumptions only to manage code complexity).
+ *
+ * @errors and @folios have @nr_folios number of entries, with one-to-one
+ * correspondence (@errors[i] represents the error status of @folios[i],
+ * for i in @nr_folios).
+ * The calling function (for instance, swap_writepage()) should initialize
+ * @errors[i] to a non-0 value.
+ * If zswap successfully stores @folios[i], it will set @errors[i] to 0.
+ * If there is an error in zswap, it will set @errors[i] to -EINVAL.
+ */
+static void __zswap_store_batch_core(
+	int node_id,
+	struct folio **folios,
+	int *errors,
+	unsigned int nr_folios)
+{
+	struct zswap_store_sub_batch_page sub_batch[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct page *comp_pages[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	u8 *comp_dsts[SWAP_CRYPTO_SUB_BATCH_SIZE] = { NULL };
+	unsigned int comp_dlens[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	int comp_errors[SWAP_CRYPTO_SUB_BATCH_SIZE];
+	struct crypto_acomp_ctx *acomp_ctx;
+	struct zswap_pool *pool;
+	/*
+	 * For now, let's say a max of 256 large folios can be reclaimed
+	 * at a time, as a batch. If this exceeds 256, change this to u16.
+	 */
+	u8 batch_idx;
+
+	/* Initialize the compress batching pipeline state.
 */
+	struct zswap_store_pipeline_state zst = {
+		.errors = errors,
+		.pool = NULL,
+		.acomp_ctx = NULL,
+		.sub_batch = sub_batch,
+		.comp_pages = comp_pages,
+		.comp_dsts = comp_dsts,
+		.comp_dlens = comp_dlens,
+		.comp_errors = comp_errors,
+		.nr_comp_pages = 0,
+	};
+
+	pool = zswap_pool_current_get();
+	if (!pool) {
+		if (zswap_check_limits())
+			queue_work(shrink_wq, &zswap_shrink_work);
+		goto check_old;
+	}
+
+	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
+	mutex_lock(&acomp_ctx->mutex);
+	zst.pool = pool;
+	zst.acomp_ctx = acomp_ctx;
+
+	/*
+	 * Iterate over the folios passed in. Construct sub-batches of up to
+	 * SWAP_CRYPTO_SUB_BATCH_SIZE pages, if necessary, by iterating through
+	 * multiple folios from the input "folios". Process each sub-batch
+	 * with IAA batch compression. Detect errors from batch compression
+	 * and set the impacted folio's error status (this happens in
+	 * zswap_store_propagate_errors()).
+	 */
+	for (batch_idx = 0; batch_idx < nr_folios; ++batch_idx) {
+		struct folio *folio = folios[batch_idx];
+		struct zswap_entry *entries[SWAP_CRYPTO_SUB_BATCH_SIZE];
+		struct obj_cgroup *objcg = NULL;
+		struct mem_cgroup *memcg = NULL;
+		long folio_start_idx, nr_pages;
+
+		BUG_ON(!folio);
+		nr_pages = folio_nr_pages(folio);
+
+		VM_WARN_ON_ONCE(!folio_test_locked(folio));
+		VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
+
+		/*
+		 * If zswap is disabled, we must invalidate the possibly stale
+		 * entry which was previously stored at this offset. Otherwise,
+		 * writeback could overwrite the new data in the swapfile.
+		 */
+		if (!zswap_enabled)
+			continue;
+
+		/* Check cgroup limits */
+		objcg = get_obj_cgroup_from_folio(folio);
+		if (objcg && !obj_cgroup_may_zswap(objcg)) {
+			memcg = get_mem_cgroup_from_objcg(objcg);
+			if (shrink_memcg(memcg)) {
+				mem_cgroup_put(memcg);
+				goto put_objcg;
+			}
+			mem_cgroup_put(memcg);
+		}
+
+		if (zswap_check_limits())
+			goto put_objcg;
+
+		if (objcg) {
+			memcg = get_mem_cgroup_from_objcg(objcg);
+			if (memcg_list_lru_alloc(memcg, &zswap_list_lru, GFP_KERNEL)) {
+				mem_cgroup_put(memcg);
+				goto put_objcg;
+			}
+			mem_cgroup_put(memcg);
+		}
+
+		/*
+		 * By default, set zswap status to "success" for use in
+		 * swap_writepage() when this returns. In case of errors,
+		 * a negative error number will overwrite this when
+		 * zswap_store_propagate_errors() is called.
+		 */
+		errors[batch_idx] = 0;
+
+		folio_start_idx = 0;
+
+		while (nr_pages > 0) {
+			u8 add_nr_pages;
+
+			/*
+			 * If we have accumulated SWAP_CRYPTO_SUB_BATCH_SIZE
+			 * pages, process the sub-batch: it could contain pages
+			 * from multiple folios.
+			 */
+			if (zst.nr_comp_pages == SWAP_CRYPTO_SUB_BATCH_SIZE) {
+				zswap_store_sub_batch(&zst);
+				zswap_store_reset_sub_batch(&zst);
+				/*
+				 * Stop processing this folio if it had
+				 * compress errors.
+				 */
+				if (errors[batch_idx])
+					goto put_objcg;
+			}
+
+			add_nr_pages = min3((
+				(long)SWAP_CRYPTO_SUB_BATCH_SIZE -
+				(long)zst.nr_comp_pages),
+				nr_pages,
+				(long)SWAP_CRYPTO_SUB_BATCH_SIZE);
+
+			/*
+			 * Allocate zswap_entries for this sub-batch. If we
+			 * get errors while doing so, we can flag an error
+			 * for the folio, call the shrinker and move on.
+			 */
+			if (zswap_alloc_entries(add_nr_pages,
+						entries, node_id)) {
+				zswap_store_reset_sub_batch(&zst);
+				errors[batch_idx] = -EINVAL;
+				goto put_objcg;
+			}
+
+			zswap_add_folio_pages_to_sb(
+				&zst,
+				folio,
+				batch_idx,
+				objcg,
+				entries,
+				folio_start_idx,
+				add_nr_pages);
+
+			nr_pages -= add_nr_pages;
+			folio_start_idx += add_nr_pages;
+		} /* this folio has pages to be compressed.
*/ + + obj_cgroup_put(objcg); + continue; + +put_objcg: + obj_cgroup_put(objcg); + if (zswap_pool_reached_full) + queue_work(shrink_wq, &zswap_shrink_work); + } /* for batch folios */ + + if (!zswap_enabled) + goto check_old; + + /* + * Process last sub-batch: it could contain pages from + * multiple folios. + */ + if (zst.nr_comp_pages) + zswap_store_sub_batch(&zst); + + mutex_unlock(&acomp_ctx->mutex); + zswap_pool_put(pool); +check_old: + zswap_store_process_folio_errors(folios, errors, nr_folios); +} + bool zswap_load(struct folio *folio) { swp_entry_t swp = folio->swap; From patchwork Fri Oct 18 06:41:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Sridhar, Kanchana P" X-Patchwork-Id: 13841240 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9BFEED3C550 for ; Fri, 18 Oct 2024 06:41:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id AA4686B009E; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9D64A6B00A3; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5C5536B00A0; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 2A3426B009E for ; Fri, 18 Oct 2024 02:41:16 -0400 (EDT) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id BCBB114059D for ; Fri, 18 Oct 2024 06:41:03 +0000 (UTC) X-FDA: 82685775816.24.042F940 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by imf24.hostedemail.com (Postfix) with ESMTP id 6F867180015 for ; 
From: Kanchana P Sridhar
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
    yosryahmed@google.com, nphamcs@gmail.com, chengming.zhou@linux.dev,
    usamaarif642@gmail.com, ryan.roberts@arm.com, ying.huang@intel.com,
    21cnbao@gmail.com, akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
    herbert@gondor.apana.org.au, davem@davemloft.net, clabbe@baylibre.com,
    ardb@kernel.org, ebiggers@google.com, surenb@google.com,
    kristen.c.accardi@intel.com, zanussi@kernel.org, viro@zeniv.linux.org.uk,
    brauner@kernel.org, jack@suse.cz, mcgrof@kernel.org, kees@kernel.org,
    joel.granados@kernel.org, bfoster@redhat.com, willy@infradead.org,
    linux-fsdevel@vger.kernel.org
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com,
    kanchana.p.sridhar@intel.com
Subject: [RFC PATCH v1 13/13] mm: vmscan, swap, zswap: Compress batching of folios in shrink_folio_list().
Date: Thu, 17 Oct 2024 23:41:01 -0700
Message-Id: <20241018064101.336232-14-kanchana.p.sridhar@intel.com>
In-Reply-To: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
References: <20241018064101.336232-1-kanchana.p.sridhar@intel.com>
MIME-Version: 1.0

This patch enables the use of Intel IAA hardware compression
acceleration to reclaim a batch of folios in
shrink_folio_list(). This results in reclaim throughput and
workload/sys performance improvements.

The earlier patches on compress batching deployed multiple IAA
compress engines to compress up to SWAP_CRYPTO_SUB_BATCH_SIZE pages
within a large folio being stored in zswap_store(). This patch further
propagates the efficiency improvements demonstrated with IAA "batching
within folios" to vmscan "batching of folios", which also uses batching
within folios through the extensible architecture of the
__zswap_store_batch_core() procedure added earlier, which accepts an
array of folios.

A plug mechanism is introduced in swap_writepage() to aggregate a
batch of up to vm.compress-batchsize ([1, 32]) folios before
processing the plug. The plug is processed when any of the following
is true:

1) The plug has vm.compress-batchsize folios. If the system has Intel
   IAA, "sysctl vm.compress-batchsize" can be configured to a value in
   [1, 32]. On systems without IAA, or if
   CONFIG_ZSWAP_STORE_BATCHING_ENABLED is not set,
   "sysctl vm.compress-batchsize" can only be 1.

2) A folio with a swap type or folio_nid different from the folios
   currently in the plug needs to be added to the plug.

3) A pmd-mappable folio needs to be swapped out. In this case, the
   existing folios in the plug are processed first. The pmd-mappable
   folio is then swapped out in a batch of its own (zswap_store() will
   batch-compress SWAP_CRYPTO_SUB_BATCH_SIZE pages of the
   pmd-mappable folio if the system has IAA).

From zswap's perspective, it now receives a hybrid batch of any-order
(non-pmd-mappable) folios when the plug is processed via
zswap_store_batch(), which calls __zswap_store_batch_core(). This
ensures that the zswap compress batching pipeline occupancy and
reclaim throughput are maximized.

The shrink_folio_list() interface with swap_writepage() is modified
to work with the plug mechanism.
When shrink_folio_list() calls pageout(), it needs to handle two new
return codes from pageout(), namely PAGE_BATCHED and
PAGE_BATCH_SUCCESS:

PAGE_BATCHED: The page is not yet swapped out; we need to wait for the
"imc_plug" batch to be processed before running the post-pageout
computes in shrink_folio_list().

PAGE_BATCH_SUCCESS: When the "imc_plug" is processed in
swap_writepage(), a newly added status "AOP_PAGE_BATCH_SUCCESS" is
returned to pageout(), which in turn returns PAGE_BATCH_SUCCESS to
shrink_folio_list(). Upon receiving PAGE_BATCH_SUCCESS from pageout(),
shrink_folio_list() must serialize and run the post-pageout computes
for all the folios in "imc_plug".

To summarize this approach: this patch introduces a plug in reclaim
that aggregates a batch of folios, parallelizes the zswap store of the
folios using IAA hardware acceleration, then returns to run the
serialized flow after the "batch pageout". The patch attempts to do
this with a minimal/necessary amount of code duplication, by adding an
iteration through the "imc_plug" folios in shrink_folio_list(). I have
validated this extensively and not seen any issues. I would appreciate
suggestions to improve upon this approach.

This functionality is submitted as a single distinct patch in the RFC
patch-series because all the changes in this specific patch are for
shrink_folio_list() batching; they wouldn't make sense without the
functionality in this patch. Besides the functionality itself, I would
also appreciate comments on whether the patch needs to be organized
differently.

Thanks Ying Huang for suggesting ideas on simplifying the vmscan
interface to the swap_writepage() plug mechanism.
Suggested-by: Ying Huang
Signed-off-by: Kanchana P Sridhar
---
 include/linux/fs.h        |   2 +
 include/linux/mm.h        |   8 ++
 include/linux/writeback.h |   5 ++
 include/linux/zswap.h     |  16 ++++
 kernel/sysctl.c           |   9 +++
 mm/page_io.c              | 152 ++++++++++++++++++++++++++++++++++++-
 mm/swap.c                 |  15 ++++
 mm/swap.h                 |  40 ++++++++++
 mm/vmscan.c               | 154 +++++++++++++++++++++++++++++++-------
 mm/zswap.c                |  20 +++++
 10 files changed, 394 insertions(+), 27 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3559446279c1..2868925568a5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -303,6 +303,8 @@ struct iattr {
 enum positive_aop_returns {
 	AOP_WRITEPAGE_ACTIVATE	= 0x80000,
 	AOP_TRUNCATED_PAGE	= 0x80001,
+	AOP_PAGE_BATCHED	= 0x80002,
+	AOP_PAGE_BATCH_SUCCESS	= 0x80003,
 };
 
 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4c32003c8404..a8035e163793 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -80,6 +80,14 @@ extern void * high_memory;
 extern int page_cluster;
 extern const int page_cluster_max;
 
+/*
+ * Compress batching of any-order folios in the reclaim path with IAA.
+ * The number of folios to batch reclaim can be set through
+ * "sysctl vm.compress-batchsize" which can be a value in [1, 32].
+ */
+extern int compress_batchsize;
+extern const int compress_batchsize_max;
+
 #ifdef CONFIG_SYSCTL
 extern int sysctl_legacy_va_layout;
 #else
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d6db822e4bb3..41629ea5699d 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -82,6 +82,11 @@ struct writeback_control {
 	/* Target list for splitting a large folio */
 	struct list_head *list;
 
+	/*
+	 * Plug for storing reclaim folios for compress batching.
+	 */
+	struct swap_in_memory_cache_cb *swap_in_memory_cache_plug;
+
 	/* internal fields used by the ->writepages implementation: */
 	struct folio_batch fbatch;
 	pgoff_t index;
diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 9bbe330686f6..328a1e09d502 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -11,6 +11,8 @@ extern atomic_long_t zswap_stored_pages;
 
 #ifdef CONFIG_ZSWAP
 
+struct swap_in_memory_cache_cb;
+
 struct zswap_lruvec_state {
 	/*
 	 * Number of swapped in pages from disk, i.e not found in the zswap pool.
@@ -107,6 +109,15 @@ struct zswap_store_pipeline_state {
 };
 
 bool zswap_store_batching_enabled(void);
+void __zswap_store_batch(struct swap_in_memory_cache_cb *simc);
+void __zswap_store_batch_single(struct swap_in_memory_cache_cb *simc);
+static inline void zswap_store_batch(struct swap_in_memory_cache_cb *simc)
+{
+	if (zswap_store_batching_enabled())
+		__zswap_store_batch(simc);
+	else
+		__zswap_store_batch_single(simc);
+}
 unsigned long zswap_total_pages(void);
 bool zswap_store(struct folio *folio);
 bool zswap_load(struct folio *folio);
@@ -123,12 +134,17 @@ bool zswap_never_enabled(void);
 struct zswap_lruvec_state {};
 struct zswap_store_sub_batch_page {};
 struct zswap_store_pipeline_state {};
+struct swap_in_memory_cache_cb;
 
 static inline bool zswap_store_batching_enabled(void)
 {
 	return false;
 }
 
+static inline void zswap_store_batch(struct swap_in_memory_cache_cb *simc)
+{
+}
+
 static inline bool zswap_store(struct folio *folio)
 {
 	return false;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 79e6cb1d5c48..b8d6b599e9ae 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2064,6 +2064,15 @@ static struct ctl_table vm_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= (void *)&page_cluster_max,
 	},
+	{
+		.procname	= "compress-batchsize",
+		.data		= &compress_batchsize,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ONE,
+		.extra2		= (void *)&compress_batchsize_max,
+	},
 	{
 		.procname	= "dirtytime_expire_seconds",
 		.data		= &dirtytime_expire_interval,
diff --git a/mm/page_io.c b/mm/page_io.c
index a28d28b6b3ce..065db25309b8 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -226,6 +226,131 @@ static void swap_zeromap_folio_clear(struct folio *folio)
 	}
 }
 
+/*
+ * For batching of folios in reclaim path for zswap batch compressions
+ * with Intel IAA.
+ */
+static void simc_write_in_memory_cache_complete(
+	struct swap_in_memory_cache_cb *simc,
+	struct writeback_control *wbc)
+{
+	int i;
+
+	/* All elements of a plug write batch have the same swap type. */
+	struct swap_info_struct *sis = swp_swap_info(simc->folios[0]->swap);
+
+	VM_BUG_ON(!sis);
+
+	for (i = 0; i < simc->nr_folios; ++i) {
+		struct folio *folio = simc->folios[i];
+
+		if (!simc->errors[i]) {
+			count_mthp_stat(folio_order(folio), MTHP_STAT_ZSWPOUT);
+			folio_unlock(folio);
+		} else {
+			__swap_writepage(simc->folios[i], wbc);
+		}
+	}
+}
+
+void swap_write_in_memory_cache_unplug(struct swap_in_memory_cache_cb *simc,
+				       struct writeback_control *wbc)
+{
+	unsigned long pflags;
+
+	psi_memstall_enter(&pflags);
+
+	zswap_store_batch(simc);
+
+	simc_write_in_memory_cache_complete(simc, wbc);
+
+	psi_memstall_leave(&pflags);
+
+	simc->processed = true;
+}
+
+/*
+ * Only called by swap_writepage() if (wbc && wbc->swap_in_memory_cache_plug)
+ * is true i.e., from shrink_folio_list()->pageout() path.
+ */
+static bool swap_writepage_in_memory_cache(struct folio *folio,
+					   struct writeback_control *wbc)
+{
+	struct swap_in_memory_cache_cb *simc;
+	unsigned type = swp_type(folio->swap);
+	int node_id = folio_nid(folio);
+	int comp_batch_size = READ_ONCE(compress_batchsize);
+	bool ret = false;
+
+	simc = wbc->swap_in_memory_cache_plug;
+
+	if ((simc->nr_folios > 0) &&
+	    ((simc->type != type) || (simc->node_id != node_id) ||
+	     folio_test_pmd_mappable(folio) ||
+	     (simc->nr_folios == comp_batch_size))) {
+		swap_write_in_memory_cache_unplug(simc, wbc);
+		ret = true;
+		simc->next_batch_folio = folio;
+	} else {
+		simc->type = type;
+		simc->node_id = node_id;
+		simc->folios[simc->nr_folios] = folio;
+
+		/*
+		 * If zswap successfully stores a page, it should set
+		 * simc->errors[] to 0.
+		 */
+		simc->errors[simc->nr_folios] = -1;
+		simc->nr_folios++;
+	}
+
+	return ret;
+}
+
+void swap_writepage_in_memory_cache_transition(void *arg)
+{
+	struct swap_in_memory_cache_cb *simc =
+		(struct swap_in_memory_cache_cb *) arg;
+	simc->nr_folios = 0;
+	simc->processed = false;
+
+	if (simc->next_batch_folio) {
+		struct folio *folio = simc->next_batch_folio;
+		simc->folios[simc->nr_folios] = folio;
+		simc->type = swp_type(folio->swap);
+		simc->node_id = folio_nid(folio);
+		simc->next_batch_folio = NULL;
+
+		/*
+		 * If zswap successfully stores a page, it should set
+		 * simc->errors[] to 0.
+		 */
+		simc->errors[simc->nr_folios] = -1;
+		simc->nr_folios++;
+	}
+}
+
+void swap_writepage_in_memory_cache_init(void *arg)
+{
+	struct swap_in_memory_cache_cb *simc =
+		(struct swap_in_memory_cache_cb *) arg;
+
+	simc->nr_folios = 0;
+	simc->processed = false;
+	simc->next_batch_folio = NULL;
+	simc->transition = &swap_writepage_in_memory_cache_transition;
+}
+
+/*
+ * zswap batching of folios with IAA:
+ *
+ * Reclaim batching note for pmd-mappable folios:
+ * Any pmd-mappable folio in the reclaim path will be processed in a batch
+ * comprising only that folio. There will be no mixed batches containing
+ * pmd-mappable folios for batch compression with IAA.
+ * There are no restrictions with other large folios: a reclaim batch
+ * can comprise of any-order mix of non-pmd-mappable folios.
+ */
 /*
  * We may have stale swap cache pages in memory: notice
  * them here and get rid of the unnecessary final write.
@@ -268,7 +393,32 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
 		 */
 		swap_zeromap_folio_clear(folio);
 	}
-	if (zswap_store(folio)) {
+
+	/*
+	 * Batching of compressions with IAA: If reclaim path pageout has
+	 * invoked swap_writepage with a wbc->swap_in_memory_cache_plug,
+	 * add the page to the plug, or invoke zswap_store_batch() if
+	 * "vm.compress-batchsize" elements have been stored in
+	 * the plug.
+	 *
+	 * If swap_writepage has been called from other kernel code without
+	 * a wbc->swap_in_memory_cache_plug, call zswap_store() with the folio
+	 * (i.e. without adding the folio to a plug for batch processing).
+	 */
+	if (wbc && wbc->swap_in_memory_cache_plug) {
+		if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio)) &&
+		    !zswap_is_enabled() &&
+		    folio_memcg(folio) &&
+		    !READ_ONCE(folio_memcg(folio)->zswap_writeback)) {
+			folio_mark_dirty(folio);
+			return AOP_WRITEPAGE_ACTIVATE;
+		}
+
+		if (swap_writepage_in_memory_cache(folio, wbc))
+			return AOP_PAGE_BATCH_SUCCESS;
+		else
+			return AOP_PAGE_BATCHED;
+	} else if (zswap_store(folio)) {
 		count_mthp_stat(folio_order(folio), MTHP_STAT_ZSWPOUT);
 		folio_unlock(folio);
 		return 0;
diff --git a/mm/swap.c b/mm/swap.c
index 835bdf324b76..095630d6c35e 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -38,6 +38,7 @@
 #include
 #include
 
+#include "swap.h"
 #include "internal.h"
 
 #define CREATE_TRACE_POINTS
@@ -47,6 +48,14 @@
 int page_cluster;
 const int page_cluster_max = 31;
 
+/*
+ * Number of pages in a reclaim batch for pageout.
+ * If zswap is enabled, this is the batch-size for zswap
+ * compress batching of multiple any-order folios.
+ */
+int compress_batchsize;
+const int compress_batchsize_max = SWAP_CRYPTO_MAX_COMP_BATCH_SIZE;
+
 struct cpu_fbatches {
 	/*
 	 * The following folio batches are grouped together because they are protected
@@ -1105,4 +1114,10 @@ void __init swap_setup(void)
 	 * Right now other parts of the system means that we
 	 * _really_ don't want to cluster much more
 	 */
+
+	/*
+	 * Initialize the number of pages in a reclaim batch
+	 * for pageout.
+	 */
+	compress_batchsize = 1;
 }
diff --git a/mm/swap.h b/mm/swap.h
index 4dcb67e2cc33..08c04954304f 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -20,6 +20,13 @@ struct mempolicy;
 #define SWAP_CRYPTO_SUB_BATCH_SIZE 1UL
 #endif
 
+/* Set the vm.compress-batchsize limits. */
+#if defined(CONFIG_ZSWAP_STORE_BATCHING_ENABLED)
+#define SWAP_CRYPTO_MAX_COMP_BATCH_SIZE SWAP_CLUSTER_MAX
+#else
+#define SWAP_CRYPTO_MAX_COMP_BATCH_SIZE 1UL
+#endif
+
 /* linux/mm/swap_state.c, zswap.c */
 struct crypto_acomp_ctx {
 	struct crypto_acomp *acomp;
@@ -53,6 +60,20 @@ void swap_crypto_acomp_compress_batch(
 	int nr_pages,
 	struct crypto_acomp_ctx *acomp_ctx);
 
+/* linux/mm/vmscan.c, linux/mm/page_io.c, linux/mm/zswap.c */
+/* For batching of compressions in reclaim path. */
+struct swap_in_memory_cache_cb {
+	unsigned int type;
+	int node_id;
+	struct folio *folios[SWAP_CLUSTER_MAX];
+	int errors[SWAP_CLUSTER_MAX];
+	unsigned int nr_folios;
+	bool processed;
+	struct folio *next_batch_folio;
+	void (*transition)(void *);
+	void (*init)(void *);
+};
+
 /* linux/mm/page_io.c */
 int sio_pool_init(void);
 struct swap_iocb;
@@ -63,6 +84,10 @@ static inline void swap_read_unplug(struct swap_iocb *plug)
 	if (unlikely(plug))
 		__swap_read_unplug(plug);
 }
+void swap_writepage_in_memory_cache_init(void *arg);
+void swap_writepage_in_memory_cache_transition(void *arg);
+void swap_write_in_memory_cache_unplug(struct swap_in_memory_cache_cb *simc,
+				       struct writeback_control *wbc);
 void swap_write_unplug(struct swap_iocb *sio);
 int swap_writepage(struct page *page, struct writeback_control *wbc);
 void __swap_writepage(struct folio *folio, struct writeback_control *wbc);
@@ -164,6 +189,21 @@ static inline void swap_crypto_acomp_compress_batch(
 {
 }
 
+struct swap_in_memory_cache_cb {};
+static void swap_writepage_in_memory_cache_init(void *arg)
+{
+}
+
+static void swap_writepage_in_memory_cache_transition(void *arg)
+{
+}
+
+static inline void swap_write_in_memory_cache_unplug(
+	struct swap_in_memory_cache_cb *simc,
+	struct writeback_control *wbc)
+{
+}
+
 static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 {
 }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fd3908d43b07..145e6cde78cd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -619,6 +619,13 @@ typedef enum {
 	PAGE_ACTIVATE,
 	/* folio has been sent to the disk successfully, folio is unlocked */
 	PAGE_SUCCESS,
+	/*
+	 * reclaim folio batch has been sent to swap successfully,
+	 * folios are unlocked
+	 */
+	PAGE_BATCH_SUCCESS,
+	/* folio has been added to the reclaim batch. */
+	PAGE_BATCHED,
 	/* folio is clean and locked */
 	PAGE_CLEAN,
 } pageout_t;
@@ -628,7 +635,8 @@ typedef enum {
  * Calls ->writepage().
 */
 static pageout_t pageout(struct folio *folio, struct address_space *mapping,
-			 struct swap_iocb **plug, struct list_head *folio_list)
+			 struct swap_iocb **plug, struct list_head *folio_list,
+			 struct swap_in_memory_cache_cb *imc_plug)
 {
 	/*
 	 * If the folio is dirty, only perform writeback if that write
@@ -674,6 +682,7 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
 			.range_end = LLONG_MAX,
 			.for_reclaim = 1,
 			.swap_plug = plug,
+			.swap_in_memory_cache_plug = imc_plug,
 		};
 
 		/*
@@ -693,6 +702,23 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
 			return PAGE_ACTIVATE;
 		}
 
+		if (res == AOP_PAGE_BATCHED)
+			return PAGE_BATCHED;
+
+		if (res == AOP_PAGE_BATCH_SUCCESS) {
+			int r;
+			for (r = 0; r < imc_plug->nr_folios; ++r) {
+				struct folio *rfolio = imc_plug->folios[r];
+				if (!folio_test_writeback(rfolio)) {
+					/* synchronous write or broken a_ops? */
+					folio_clear_reclaim(rfolio);
+				}
+				trace_mm_vmscan_write_folio(rfolio);
+				node_stat_add_folio(rfolio, NR_VMSCAN_WRITE);
+			}
+			return PAGE_BATCH_SUCCESS;
+		}
+
 		if (!folio_test_writeback(folio)) {
 			/* synchronous write or broken a_ops? */
 			folio_clear_reclaim(folio);
@@ -1035,6 +1061,12 @@ static bool may_enter_fs(struct folio *folio, gfp_t gfp_mask)
 	return !data_race(folio_swap_flags(folio) & SWP_FS_OPS);
 }
 
+static __always_inline bool reclaim_batch_being_processed(
+	struct swap_in_memory_cache_cb *imc_plug)
+{
+	return imc_plug->nr_folios && imc_plug->processed;
+}
+
 /*
  * shrink_folio_list() returns the number of reclaimed pages
  */
@@ -1049,22 +1081,54 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 	unsigned int pgactivate = 0;
 	bool do_demote_pass;
 	struct swap_iocb *plug = NULL;
+	struct swap_in_memory_cache_cb imc_plug;
+	bool imc_plug_path = false;
+	struct folio *folio;
+	int r;
 
+	imc_plug.init = &swap_writepage_in_memory_cache_init;
+	imc_plug.init(&imc_plug);
 	folio_batch_init(&free_folios);
 	memset(stat, 0, sizeof(*stat));
 	cond_resched();
 	do_demote_pass = can_demote(pgdat->node_id, sc);
 
retry:
-	while (!list_empty(folio_list)) {
+	while (!list_empty(folio_list) || (imc_plug.nr_folios && !imc_plug.processed)) {
 		struct address_space *mapping;
-		struct folio *folio;
 		enum folio_references references = FOLIOREF_RECLAIM;
 		bool dirty, writeback;
 		unsigned int nr_pages;
 
+		imc_plug_path = false;
 		cond_resched();
 
+		/* Reclaim path zswap/zram batching using IAA. */
+		if (list_empty(folio_list)) {
+			struct writeback_control wbc = {
+				.sync_mode = WB_SYNC_NONE,
+				.nr_to_write = SWAP_CLUSTER_MAX,
+				.range_start = 0,
+				.range_end = LLONG_MAX,
+				.for_reclaim = 1,
+				.swap_plug = &plug,
+				.swap_in_memory_cache_plug = &imc_plug,
+			};
+
+			swap_write_in_memory_cache_unplug(&imc_plug, &wbc);
+
+			for (r = 0; r < imc_plug.nr_folios; ++r) {
+				struct folio *rfolio = imc_plug.folios[r];
+				if (!folio_test_writeback(rfolio)) {
+					/* synchronous write or broken a_ops? */
+					folio_clear_reclaim(rfolio);
+				}
+				trace_mm_vmscan_write_folio(rfolio);
+				node_stat_add_folio(rfolio, NR_VMSCAN_WRITE);
+			}
+			goto serialize_post_batch_pageout;
+		}
+
 		folio = lru_to_folio(folio_list);
 		list_del(&folio->lru);
 
@@ -1363,7 +1427,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			 * starts and then write it out here.
 			 */
 			try_to_unmap_flush_dirty();
-			switch (pageout(folio, mapping, &plug, folio_list)) {
+			switch (pageout(folio, mapping, &plug, folio_list, &imc_plug)) {
 			case PAGE_KEEP:
 				goto keep_locked;
 			case PAGE_ACTIVATE:
@@ -1377,34 +1441,66 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					nr_pages = 1;
 				}
 				goto activate_locked;
+			case PAGE_BATCHED:
+				continue;
 			case PAGE_SUCCESS:
-				if (nr_pages > 1 && !folio_test_large(folio)) {
-					sc->nr_scanned -= (nr_pages - 1);
-					nr_pages = 1;
-				}
-				stat->nr_pageout += nr_pages;
-
-				if (folio_test_writeback(folio))
-					goto keep;
-				if (folio_test_dirty(folio))
-					goto keep;
-
-				/*
-				 * A synchronous write - probably a ramdisk.  Go
-				 * ahead and try to reclaim the folio.
-				 */
-				if (!folio_trylock(folio))
-					goto keep;
-				if (folio_test_dirty(folio) ||
-				    folio_test_writeback(folio))
-					goto keep_locked;
-				mapping = folio_mapping(folio);
-				fallthrough;
+				goto post_single_pageout;
+			case PAGE_BATCH_SUCCESS:
+				goto serialize_post_batch_pageout;
 			case PAGE_CLEAN:
+				goto folio_is_clean;
 				; /* try to free the folio below */
 			}
+		} else {
+			goto folio_is_clean;
+		}
+
+serialize_post_batch_pageout:
+		imc_plug_path = reclaim_batch_being_processed(&imc_plug);
+		if (!imc_plug_path) {
+			pr_err("imc_plug: type %u node_id %d \
+				nr_folios %u processed %d next_batch_folio %px",
+				imc_plug.type, imc_plug.node_id,
+				imc_plug.nr_folios, imc_plug.processed,
+				imc_plug.next_batch_folio);
+		}
+		BUG_ON(!imc_plug_path);
+		r = -1;
+
+next_folio_in_batch:
+		while (++r < imc_plug.nr_folios) {
+			folio = imc_plug.folios[r];
+			goto post_single_pageout;
+		} /* while imc_plug folios. */
+
+		imc_plug.transition(&imc_plug);
+		continue;
+
+post_single_pageout:
+		mapping = folio_mapping(folio);
+		nr_pages = folio_nr_pages(folio);
+		if (nr_pages > 1 && !folio_test_large(folio)) {
+			sc->nr_scanned -= (nr_pages - 1);
+			nr_pages = 1;
 		}
+		stat->nr_pageout += nr_pages;
+
+		if (folio_test_writeback(folio))
+			goto keep;
+		if (folio_test_dirty(folio))
+			goto keep;
+
+		/*
+		 * A synchronous write - probably a ramdisk.  Go
+		 * ahead and try to reclaim the folio.
+		 */
+		if (!folio_trylock(folio))
+			goto keep;
+		if (folio_test_dirty(folio) ||
+		    folio_test_writeback(folio))
+			goto keep_locked;
+folio_is_clean:
 		/*
 		 * If the folio has buffers, try to free the buffer
 		 * mappings associated with this folio. If we succeed
@@ -1444,6 +1540,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 				 * leave it off the LRU).
 				 */
 				nr_reclaimed += nr_pages;
+				if (imc_plug_path)
+					goto next_folio_in_batch;
 				continue;
 			}
 		}
@@ -1481,6 +1579,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			try_to_unmap_flush();
 			free_unref_folios(&free_folios);
 		}
+		if (imc_plug_path)
+			goto next_folio_in_batch;
 		continue;
 
activate_locked_split:
@@ -1510,6 +1610,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 		list_add(&folio->lru, &ret_folios);
 		VM_BUG_ON_FOLIO(folio_test_lru(folio) ||
 				folio_test_unevictable(folio), folio);
+		if (imc_plug_path)
+			goto next_folio_in_batch;
 	}
 
 	/* 'folio_list' is always empty here */
diff --git a/mm/zswap.c b/mm/zswap.c
index 1c12a7b9f4ff..68ce498ad000 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1666,6 +1666,26 @@ bool zswap_store(struct folio *folio)
 	return ret;
 }
 
+/*
+ * The batch contains <= vm.compress-batchsize nr of folios.
+ * All folios in the batch have the same swap type and folio_nid.
+ */
+void __zswap_store_batch(struct swap_in_memory_cache_cb *simc)
+{
+	__zswap_store_batch_core(simc->node_id, simc->folios,
+				 simc->errors, simc->nr_folios);
+}
+
+void __zswap_store_batch_single(struct swap_in_memory_cache_cb *simc)
+{
+	u8 i;
+
+	for (i = 0; i < simc->nr_folios; ++i) {
+		if (zswap_store(simc->folios[i]))
+			simc->errors[i] = 0;
+	}
+}
+
 /*
  * Note: If SWAP_CRYPTO_SUB_BATCH_SIZE exceeds 256, change the
  * u8 stack variables in the next several functions, to u16.