From patchwork Fri May 25 12:45:24 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhang, Lei" <zhang.lei@jp.fujitsu.com>
X-Patchwork-Id: 10427379
From: "Zhang, Lei" <zhang.lei@jp.fujitsu.com>
To: "'marc.zyngier@arm.com'" <marc.zyngier@arm.com>,
	"'linux-arm-kernel@lists.infradead.org'"
	<linux-arm-kernel@lists.infradead.org>
Subject: [PATCH v2] irqchip/irq-gic-v3: Avoid a waste of LPI resource
Thread-Topic: [PATCH v2] irqchip/irq-gic-v3: Avoid a waste of LPI resource
Thread-Index: AdP0JjLJfmmzjANESgCZIXb1//C1uQ==
Date: Fri, 25 May 2018 12:45:24 +0000
Message-ID: <8898674D84E3B24BA3A2D289B872026A69F1300F@G01JPEXMBKW03>
Accept-Language: ja-JP, en-US
Content-Language: ja-JP
X-BeenThere: linux-arm-kernel@lists.infradead.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <linux-arm-kernel.lists.infradead.org>
List-Unsubscribe: 
 <http://lists.infradead.org/mailman/options/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=unsubscribe>
List-Archive: <http://lists.infradead.org/pipermail/linux-arm-kernel/>
List-Post: <mailto:linux-arm-kernel@lists.infradead.org>
List-Help: <mailto:linux-arm-kernel-request@lists.infradead.org?subject=help>
List-Subscribe: 
 <http://lists.infradead.org/mailman/listinfo/linux-arm-kernel>,
	<mailto:linux-arm-kernel-request@lists.infradead.org?subject=subscribe>
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: 
 linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org
X-Virus-Scanned: ClamAV using ClamSMTP

The current implementation of the irq-gic-v3-its driver allocates at least 32 LPIs
(interrupt numbers) for each Device ID, even when only one LPI is requested.
This wastes LPIs, and when many devices sit behind the ITS it may lead to
a shortage of LPIs.

This patch avoids the problem by changing how LPIs are managed:
they are tracked with a free list instead of fixed-size chunks.
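
As an illustration of the free-list idea only, here is a small user-space
sketch. It is not the code in this patch: it keeps the blocks in a fixed
array rather than a list_head, and the names struct pool, pool_init and
pool_alloc are made up for this example. It assumes the requested length is
already a power of two and that the pool itself is power-of-two sized and
aligned.

```c
#include <stddef.h>

#define MAX_BLOCKS 64	/* arbitrary limit for this sketch */

struct block {
	int base;	/* first ID in the block */
	int len;	/* number of IDs, always a power of two */
	int used;	/* 1 once handed out */
};

struct pool {
	struct block blk[MAX_BLOCKS];
	size_t n;	/* number of valid entries in blk[] */
};

/* Start with one free block covering [base, base + len). */
static void pool_init(struct pool *p, int base, int len)
{
	p->blk[0].base = base;
	p->blk[0].len = len;
	p->blk[0].used = 0;
	p->n = 1;
}

/* Allocate len IDs (len must be a power of two); returns the base or -1. */
static int pool_alloc(struct pool *p, int len)
{
	size_t i, best = MAX_BLOCKS;

	/* Pick the smallest free block that is large enough. */
	for (i = 0; i < p->n; i++) {
		if (p->blk[i].used || p->blk[i].len < len)
			continue;
		if (best == MAX_BLOCKS || p->blk[i].len < p->blk[best].len)
			best = i;
	}
	if (best == MAX_BLOCKS)
		return -1;

	/* Halve the block until it fits exactly; each upper half stays free. */
	while (p->blk[best].len > len) {
		if (p->n == MAX_BLOCKS)
			return -1;
		p->blk[best].len /= 2;
		p->blk[p->n].base = p->blk[best].base + p->blk[best].len;
		p->blk[p->n].len = p->blk[best].len;
		p->blk[p->n].used = 0;
		p->n++;
	}

	p->blk[best].used = 1;
	return p->blk[best].base;
}
```

Because every block comes from halving an aligned power-of-two block, each
base stays aligned to its own length, and a request for one ID consumes
exactly one ID instead of a whole chunk of 32.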

The main points of this patch are as follows.
Point 1: At least 32 LPIs are no longer always allocated.
The number of LPIs requested is rounded up to the nearest power of two.
Point 2: The base LPI number is guaranteed to be aligned to that power of two.
For example, if you request 2 LPIs, you get 2 LPIs and the base LPI number is aligned to 2.
If you request 15 LPIs, you get 16 LPIs and the base LPI number is aligned to 16.
Point 3: LPIs are allocated as a contiguous range.
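
The arithmetic behind Points 1 and 2 can be sketched in user space as
follows. The helper names round_up_pow2, is_aligned_base and buddy_base are
hypothetical and exist only for this example; in the kernel the rounding is
done by roundup_pow_of_two(). The buddy_base helper mirrors the modulo test
the free path uses to locate the other half of a block, and relies on the
base being aligned to the block length.

```c
/* Hypothetical user-space stand-in for the kernel's roundup_pow_of_two(). */
static unsigned int round_up_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* A block of len IDs (len a power of two) is aligned if base is a multiple of len. */
static int is_aligned_base(unsigned int base, unsigned int len)
{
	return (base & (len - 1)) == 0;
}

/*
 * Base of the other half of the pair that an aligned block belongs to.
 * If base is a multiple of 2 * len, the block is the lower half and its
 * partner starts at base + len; otherwise the partner starts at base - len.
 */
static unsigned int buddy_base(unsigned int base, unsigned int len)
{
	if (base % (len * 2) == 0)
		return base + len;
	return base - len;
}
```

For instance, round_up_pow2(15) is 16, a block of 16 LPIs at base 8208 is
aligned, and its partner block starts at 8192.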

Signed-off-by: Lei Zhang <zhang.lei@jp.fujitsu.com>
------------------------------------------------
+static void its_joint_free_list(struct lpi_mng *free, struct lpi_mng *alloc)
+{
+	free->len = free->len * 2;
+	if (free->base > alloc->base)
+		free->base = alloc->base;
+}
+
 static void its_lpi_free_chunks(unsigned long *bitmap, int base, int nr_ids)
 {
-	int lpi;
+	struct lpi_mng *lpi_alloc_mng = NULL;
+	struct lpi_mng *lpi_free_mng = NULL;
+	bool first_half;
+	int pair_base;
 
 	spin_lock(&lpi_lock);
 
-	for (lpi = base; lpi < (base + nr_ids); lpi += IRQS_PER_CHUNK) {
-		int chunk = its_lpi_to_chunk(lpi);
-
-		BUG_ON(chunk > lpi_chunks);
-		if (test_bit(chunk, lpi_bitmap)) {
-			clear_bit(chunk, lpi_bitmap);
-		} else {
-			pr_err("Bad LPI chunk %d\n", chunk);
+	list_for_each_entry(lpi_alloc_mng, &lpi_alloc_list, lpi_list) {
+		if (lpi_alloc_mng->base == base) {
+			list_del_init(&lpi_alloc_mng->lpi_list);
+			break;
 		}
 	}
 
+	first_half = (lpi_alloc_mng->base % (lpi_alloc_mng->len * 2))
+			 ? false : true;
+	if (first_half)
+		pair_base = lpi_alloc_mng->base + lpi_alloc_mng->len;
+	else
+		pair_base = lpi_alloc_mng->base - lpi_alloc_mng->len;
+
+	/* The other half is free: merge both halves into one free block */
+	list_for_each_entry(lpi_free_mng, &lpi_free_list, lpi_list) {
+		if (lpi_free_mng->base == pair_base) {
+			its_joint_free_list(lpi_free_mng, lpi_alloc_mng);
+			kfree(lpi_alloc_mng);
+			goto out;
+		}
+	}
+	/* The other half is not free: insert the block in base order */
+	list_for_each_entry(lpi_free_mng, &lpi_free_list, lpi_list) {
+		if (lpi_alloc_mng->base < lpi_free_mng->base) {
+			list_add_tail(&lpi_alloc_mng->lpi_list,
+				&lpi_free_mng->lpi_list);
+			break;
+		}
+	}
+out:
 	spin_unlock(&lpi_lock);
 
 	kfree(bitmap);
@@ -2117,12 +2187,13 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	 * We allocate at least one chunk worth of LPIs bet device,
 	 * and thus that many ITEs. The device may require less though.
 	 */
-	nr_ites = max(IRQS_PER_CHUNK, roundup_pow_of_two(nvecs));
+	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
 	sz = nr_ites * its->ite_size;
 	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
 	itt = kzalloc(sz, GFP_KERNEL);
 	if (alloc_lpis) {
-		lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
+		lpi_map = its_lpi_alloc_chunks(roundup_pow_of_two(nvecs),
+			&lpi_base, &nr_lpis);
 		if (lpi_map)
 			col_map = kzalloc(sizeof(*col_map) * nr_lpis,
 					  GFP_KERNEL);
----------------------------------------------------
Regards,
Lei Zhang
---
Lei Zhang  e-mail: zhang.lei@jp.fujitsu.com FUJITSU LIMITED

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 5416f2b..e68fca6 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1405,82 +1405,122 @@ static int its_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
 	.irq_set_vcpu_affinity	= its_irq_set_vcpu_affinity,
 };
 
-/*
- * How we allocate LPIs:
- *
- * The GIC has id_bits bits for interrupt identifiers. From there, we
- * must subtract 8192 which are reserved for SGIs/PPIs/SPIs. Then, as
- * we allocate LPIs by chunks of 32, we can shift the whole thing by 5
- * bits to the right.
- *
- * This gives us (((1UL << id_bits) - 8192) >> 5) possible allocations.
- */
-#define IRQS_PER_CHUNK_SHIFT	5
-#define IRQS_PER_CHUNK		(1UL << IRQS_PER_CHUNK_SHIFT)
-#define ITS_MAX_LPI_NRBITS	16 /* 64K LPIs */
+static struct list_head lpi_free_list;
+static struct list_head lpi_alloc_list;
+struct lpi_mng {
+	struct list_head lpi_list;
+	int base;
+	int len;
+};
 
-static unsigned long *lpi_bitmap;
-static u32 lpi_chunks;
+#define ITS_MAX_LPI_NRBITS	16 /* 64K LPIs */
 static DEFINE_SPINLOCK(lpi_lock);
 
-static int its_lpi_to_chunk(int lpi)
-{
-	return (lpi - 8192) >> IRQS_PER_CHUNK_SHIFT;
-}
-
-static int its_chunk_to_lpi(int chunk)
-{
-	return (chunk << IRQS_PER_CHUNK_SHIFT) + 8192;
-}
 
 static int __init its_lpi_init(u32 id_bits)
 {
-	lpi_chunks = its_lpi_to_chunk(1UL << id_bits);
+	u32 nr_irq = 1UL << id_bits;
+	struct lpi_mng *lpi_free_mng = NULL;
+	struct lpi_mng *lpi_new = NULL;
+
+	INIT_LIST_HEAD(&lpi_free_list);
+	INIT_LIST_HEAD(&lpi_alloc_list);
 
-	lpi_bitmap = kzalloc(BITS_TO_LONGS(lpi_chunks) * sizeof(long),
-			     GFP_KERNEL);
-	if (!lpi_bitmap) {
-		lpi_chunks = 0;
+	lpi_free_mng = kzalloc(sizeof(struct lpi_mng), GFP_KERNEL);
+	if (!lpi_free_mng)
 		return -ENOMEM;
-	}
 
-	pr_info("ITS: Allocated %d chunks for LPIs\n", (int)lpi_chunks);
+	lpi_free_mng->base = 0;
+	lpi_free_mng->len = nr_irq;
+	list_add(&lpi_free_mng->lpi_list, &lpi_free_list);
+
+	do {
+		lpi_free_mng = list_first_entry(&lpi_free_list, struct lpi_mng,
+			lpi_list);
+		if (lpi_free_mng->len == 8192) {
+			/* IDs below 8192 are not LPIs, so drop this block */
+			if (lpi_free_mng->base == 0) {
+				list_del_init(&lpi_free_mng->lpi_list);
+				kfree(lpi_free_mng);
+				continue;
+			}
+			if (lpi_free_mng->base == 8192)
+				goto out;
+		}
+		if (lpi_free_mng->len > 8192) {
+			lpi_new = kzalloc(sizeof(struct lpi_mng),
+					 GFP_ATOMIC);
+			if (!lpi_new)
+				return -ENOMEM;
+			lpi_free_mng->len /= 2;
+			lpi_new->base = lpi_free_mng->base + lpi_free_mng->len;
+			lpi_new->len = lpi_free_mng->len;
+			list_add(&lpi_new->lpi_list, &lpi_free_mng->lpi_list);
+		}
+	} while (1);
+
+out:
+	pr_info("ITS: Allocated %d LPIs\n", nr_irq - 8192);
 	return 0;
 }
 
+static struct lpi_mng *its_alloc_lpi(int nr_irqs)
+{
+	struct lpi_mng *lpi_alloc_mng = NULL;
+	struct lpi_mng *lpi_split = NULL;
+	struct lpi_mng *lpi_new = NULL;
+	int base;
+
+	base = INT_MAX;
+	do {
+		list_for_each_entry(lpi_alloc_mng, &lpi_free_list, lpi_list) {
+			if (nr_irqs > lpi_alloc_mng->len)
+				continue;
+			if (nr_irqs == lpi_alloc_mng->len) {
+				list_del_init(&lpi_alloc_mng->lpi_list);
+				list_add(&lpi_alloc_mng->lpi_list,
+					&lpi_alloc_list);
+				return lpi_alloc_mng;
+			}
+			if ((nr_irqs < lpi_alloc_mng->len)
+				&& (lpi_alloc_mng->base < base)) {
+				base = lpi_alloc_mng->base;
+				lpi_split = lpi_alloc_mng;
+			}
+		}
+		lpi_new = kzalloc(sizeof(struct lpi_mng),
+				 GFP_ATOMIC);
+		if (!lpi_new || !lpi_split)
+			return NULL;
+
+		lpi_split->len /= 2;
+		lpi_new->base = lpi_split->base + lpi_split->len;
+		lpi_new->len = lpi_split->len;
+		list_add(&lpi_new->lpi_list, &lpi_split->lpi_list);
+
+	} while (1);
+}
+
 static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
 {
 	unsigned long *bitmap = NULL;
-	int chunk_id;
-	int nr_chunks;
-	int i;
-
-	nr_chunks = DIV_ROUND_UP(nr_irqs, IRQS_PER_CHUNK);
+	struct lpi_mng *lpi_alloc_mng = NULL;
 
 	spin_lock(&lpi_lock);
 
-	do {
-		chunk_id = bitmap_find_next_zero_area(lpi_bitmap, lpi_chunks,
-						      0, nr_chunks, 0);
-		if (chunk_id < lpi_chunks)
-			break;
-
-		nr_chunks--;
-	} while (nr_chunks > 0);
+	lpi_alloc_mng = its_alloc_lpi(nr_irqs);
 
-	if (!nr_chunks)
+	if (!lpi_alloc_mng)
 		goto out;
 
-	bitmap = kzalloc(BITS_TO_LONGS(nr_chunks * IRQS_PER_CHUNK) * sizeof (long),
-			 GFP_ATOMIC);
+	bitmap = kzalloc(BITS_TO_LONGS(nr_irqs) * sizeof(long),
+		 GFP_ATOMIC);
 	if (!bitmap)
 		goto out;
 
-	for (i = 0; i < nr_chunks; i++)
-		set_bit(chunk_id + i, lpi_bitmap);
 
-	*base = its_chunk_to_lpi(chunk_id);
-	*nr_ids = nr_chunks * IRQS_PER_CHUNK;
+	*base = lpi_alloc_mng->base;
+	*nr_ids = lpi_alloc_mng->len;
 
 out:
 	spin_unlock(&lpi_lock);
@@ -1491,23 +1531,53 @@ static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
 	return bitmap;
 }