From patchwork Fri May 25 12:45:24 2018
X-Patchwork-Submitter: "Zhang, Lei"
X-Patchwork-Id: 10427379
From: "Zhang, Lei"
To: "'marc.zyngier@arm.com'", "'linux-arm-kernel@lists.infradead.org'"
Subject: [PATCH v2] irqchip/irq-gic-v3: Avoid a waste of LPI resource
Date: Fri, 25 May 2018 12:45:24 +0000
Message-ID: <8898674D84E3B24BA3A2D289B872026A69F1300F@G01JPEXMBKW03>

The current implementation of the irq-gic-v3-its driver allocates at
least 32 LPIs (interrupt numbers) for each Device ID, even if only one
LPI is requested. I think this wastes LPI resources, and if many
devices are used behind an ITS, this implementation may cause a
shortage of LPIs.

This patch avoids the problem by changing how LPIs are managed. In
detail, it uses a free list instead of the chunk method to manage
LPIs.

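As a rough illustration of the cost (a stand-alone user-space sketch,
not part of the patch), the helpers below compute how many LPIs the
chunk method reserves for a request and how many devices then fit into
the LPI space. The names chunk_reserved() and max_devices() are
hypothetical. The constants 32 and 8192 are the chunk size and the
number of non-LPI interrupt IDs used by the current driver. The sketch
assumes id_bits >= 14 and nr_irqs >= 1.

```c
#define IRQS_PER_CHUNK 32UL   /* chunk size in the current driver */
#define NR_NON_LPI_IDS 8192UL /* IDs reserved for SGIs/PPIs/SPIs */

/* LPIs the chunk method reserves for a request of nr_irqs LPIs. */
static unsigned long chunk_reserved(unsigned long nr_irqs)
{
	unsigned long nr_chunks = (nr_irqs + IRQS_PER_CHUNK - 1) / IRQS_PER_CHUNK;

	return nr_chunks * IRQS_PER_CHUNK;
}

/* How many devices requesting nr_irqs LPIs each fit into the LPI space. */
static unsigned long max_devices(unsigned int id_bits, unsigned long nr_irqs)
{
	unsigned long nr_lpis = (1UL << id_bits) - NR_NON_LPI_IDS;

	return nr_lpis / chunk_reserved(nr_irqs);
}
```

With id_bits = 16 (the ITS_MAX_LPI_NRBITS limit in the driver), 57344
LPIs exist, but chunk_reserved(1) is 32, so max_devices(16, 1) is only
1792.
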
The points of this patch are as follows.

Point 1: Do not always allocate at least 32 LPIs. Round the number of
requested LPIs up to the nearest power of two.

Point 2: Guarantee that the base LPI number is aligned to a power of
two. For example, if you request 2 LPIs, you get 2 LPIs and the base
LPI number is aligned to 2. If you request 15 LPIs, you get 16 LPIs
and the base LPI number is aligned to 16.

Point 3: LPIs are allocated as a contiguous range.

Signed-off-by: Lei Zhang
------------------------------------------------
+static void its_joint_free_list(struct lpi_mng *free, struct lpi_mng
+*alloc) {
+	free->len = free->len * 2;
+	if (free->base > alloc->base)
+		free->base = alloc->base;
+}
+
 static void its_lpi_free_chunks(unsigned long *bitmap, int base, int nr_ids)
 {
-	int lpi;
+	struct lpi_mng *lpi_alloc_mng = NULL;
+	struct lpi_mng *lpi_free_mng = NULL;
+	bool first_half;
+	int pair_base;
 	spin_lock(&lpi_lock);
-	for (lpi = base; lpi < (base + nr_ids); lpi += IRQS_PER_CHUNK) {
-		int chunk = its_lpi_to_chunk(lpi);
-
-		BUG_ON(chunk > lpi_chunks);
-		if (test_bit(chunk, lpi_bitmap)) {
-			clear_bit(chunk, lpi_bitmap);
-		} else {
-			pr_err("Bad LPI chunk %d\n", chunk);
+	list_for_each_entry(lpi_alloc_mng, &lpi_alloc_list, lpi_list) {
+		if (lpi_alloc_mng->base == base) {
+			list_del_init(&lpi_alloc_mng->lpi_list);
+			break;
 		}
 	}
+	first_half = (lpi_alloc_mng->base % (lpi_alloc_mng->len * 2))
+			? false : true;
+	if (first_half)
+		pair_base = lpi_alloc_mng->base + lpi_alloc_mng->len;
+	else
+		pair_base = lpi_alloc_mng->base - lpi_alloc_mng->len;
+
+	// found the other half
+	list_for_each_entry(lpi_free_mng, &lpi_free_list, lpi_list) {
+		if (lpi_free_mng->base == pair_base) {
+			its_joint_free_list(lpi_free_mng, lpi_alloc_mng);
+			kfree(lpi_alloc_mng);
+			goto out;
+		}
+	}
+	// Not found the other half
+	list_for_each_entry(lpi_free_mng, &lpi_free_list, lpi_list) {
+		if (lpi_alloc_mng->base < lpi_free_mng->base) {
+			list_add_tail(&lpi_alloc_mng->lpi_list,
+				      &lpi_free_mng->lpi_list);
+			break;
+		}
+	}
+out:
 	spin_unlock(&lpi_lock);
 	kfree(bitmap);
@@ -2117,12 +2187,13 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	 * We allocate at least one chunk worth of LPIs bet device,
 	 * and thus that many ITEs. The device may require less though.
 	 */
-	nr_ites = max(IRQS_PER_CHUNK, roundup_pow_of_two(nvecs));
+	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
 	sz = nr_ites * its->ite_size;
 	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
 	itt = kzalloc(sz, GFP_KERNEL);
 	if (alloc_lpis) {
-		lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
+		lpi_map = its_lpi_alloc_chunks(roundup_pow_of_two(nvecs),
+					       &lpi_base, &nr_lpis);
 		if (lpi_map)
 			col_map = kzalloc(sizeof(*col_map) * nr_lpis,
 					  GFP_KERNEL);
----------------------------------------------------

Regards,
Lei Zhang

---
Lei Zhang
e-mail: zhang.lei@jp.fujitsu.com
FUJITSU LIMITED

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 5416f2b..e68fca6 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1405,82 +1405,122 @@ static int its_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
 	.irq_set_vcpu_affinity = its_irq_set_vcpu_affinity,
 };
-/*
- * How we allocate LPIs:
- *
- * The GIC has id_bits bits for interrupt identifiers. From there, we
- * must subtract 8192 which are reserved for SGIs/PPIs/SPIs. Then, as
- * we allocate LPIs by chunks of 32, we can shift the whole thing by 5
- * bits to the right.
- *
- * This gives us (((1UL << id_bits) - 8192) >> 5) possible allocations.
- */
-#define IRQS_PER_CHUNK_SHIFT 5
-#define IRQS_PER_CHUNK (1UL << IRQS_PER_CHUNK_SHIFT)
-#define ITS_MAX_LPI_NRBITS 16 /* 64K LPIs */
+static struct list_head lpi_free_list;
+static struct list_head lpi_alloc_list;
 struct lpi_mng {
+	struct list_head lpi_list;
+	int base;
+	int len;
+};
-static unsigned long *lpi_bitmap;
-static u32 lpi_chunks;
+#define ITS_MAX_LPI_NRBITS 16 /* 64K LPIs */
 static DEFINE_SPINLOCK(lpi_lock);
-static int its_lpi_to_chunk(int lpi)
-{
-	return (lpi - 8192) >> IRQS_PER_CHUNK_SHIFT;
-}
-
-static int its_chunk_to_lpi(int chunk)
-{
-	return (chunk << IRQS_PER_CHUNK_SHIFT) + 8192;
-}
 static int __init its_lpi_init(u32 id_bits)
 {
-	lpi_chunks = its_lpi_to_chunk(1UL << id_bits);
+	u32 nr_irq = 1UL << id_bits;
+	struct lpi_mng *lpi_free_mng = NULL;
+	struct lpi_mng *lpi_new = NULL;
+
+	INIT_LIST_HEAD(&lpi_free_list);
+	INIT_LIST_HEAD(&lpi_alloc_list);
-	lpi_bitmap = kzalloc(BITS_TO_LONGS(lpi_chunks) * sizeof(long),
-			     GFP_KERNEL);
-	if (!lpi_bitmap) {
-		lpi_chunks = 0;
+	lpi_free_mng = kzalloc(sizeof(struct lpi_mng), GFP_KERNEL);
+	if (!lpi_free_mng)
 		return -ENOMEM;
-	}
-	pr_info("ITS: Allocated %d chunks for LPIs\n", (int)lpi_chunks);
+	lpi_free_mng->base = 0;
+	lpi_free_mng->len = nr_irq;
+	list_add(&lpi_free_mng->lpi_list, &lpi_free_list);
+
+	do {
+		lpi_free_mng = list_first_entry(&lpi_free_list, struct lpi_mng,
+						lpi_list);
+		if (lpi_free_mng->len == 8192) {
+			/*It is not lpi, so we delete */
+			if (lpi_free_mng->base == 0) {
+				list_del_init(&lpi_free_mng->lpi_list);
+				kfree(lpi_free_mng);
+				continue;
+			}
+			if (lpi_free_mng->base == 8192)
+				goto out;
+		}
+		if (lpi_free_mng->len > 8192) {
+			lpi_new = kzalloc(sizeof(struct lpi_mng),
+					  GFP_ATOMIC);
+			if (!lpi_new)
+				return -ENOMEM;
+			lpi_free_mng->len /= 2;
+			lpi_new->base = lpi_free_mng->base + lpi_free_mng->len;
+			lpi_new->len = lpi_free_mng->len;
+			list_add(&lpi_new->lpi_list, &lpi_free_mng->lpi_list);
+		}
+	} while (1);
+
+out:
+	pr_info("ITS: Allocated %d LPIs\n", nr_irq - 8192);
 	return 0;
 }
+static struct lpi_mng *its_alloc_lpi(int nr_irqs) {
+	struct lpi_mng *lpi_alloc_mng = NULL;
+	struct lpi_mng *lpi_split = NULL;
+	struct lpi_mng *lpi_new = NULL;
+	int base;
+
+	base = INT_MAX;
+	do {
+		list_for_each_entry(lpi_alloc_mng, &lpi_free_list, lpi_list) {
+			if (nr_irqs > lpi_alloc_mng->len)
+				continue;
+			if (nr_irqs == lpi_alloc_mng->len) {
+				list_del_init(&lpi_alloc_mng->lpi_list);
+				list_add(&lpi_alloc_mng->lpi_list,
+					 &lpi_alloc_list);
+				return lpi_alloc_mng;
+			}
+			if ((nr_irqs < lpi_alloc_mng->len)
+			    && (lpi_alloc_mng->base < base)) {
+				base = lpi_alloc_mng->base;
+				lpi_split = lpi_alloc_mng;
+			}
+		}
+		lpi_new = kzalloc(sizeof(struct lpi_mng),
+				  GFP_ATOMIC);
+		if (!lpi_new || !lpi_split)
+			return NULL;
+
+		lpi_split->len /= 2;
+		lpi_new->base = lpi_split->base + lpi_split->len;
+		lpi_new->len = lpi_split->len;
+		list_add(&lpi_new->lpi_list, &lpi_split->lpi_list);
+
+	} while (1);
+}
+
 static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
 {
 	unsigned long *bitmap = NULL;
-	int chunk_id;
-	int nr_chunks;
-	int i;
-
-	nr_chunks = DIV_ROUND_UP(nr_irqs, IRQS_PER_CHUNK);
+	struct lpi_mng *lpi_alloc_mng = NULL;
 	spin_lock(&lpi_lock);
-	do {
-		chunk_id = bitmap_find_next_zero_area(lpi_bitmap, lpi_chunks,
-						      0, nr_chunks, 0);
-		if (chunk_id < lpi_chunks)
-			break;
-
-		nr_chunks--;
-	} while (nr_chunks > 0);
+	lpi_alloc_mng = its_alloc_lpi(nr_irqs);
-	if (!nr_chunks)
+	if (!lpi_alloc_mng)
 		goto out;
-	bitmap = kzalloc(BITS_TO_LONGS(nr_chunks * IRQS_PER_CHUNK) * sizeof (long),
-			 GFP_ATOMIC);
+	bitmap = kzalloc(BITS_TO_LONGS(nr_irqs) * sizeof(long),
+			 GFP_ATOMIC);
 	if (!bitmap)
 		goto out;
-	for (i = 0; i < nr_chunks; i++)
-		set_bit(chunk_id + i, lpi_bitmap);
-	*base = its_chunk_to_lpi(chunk_id);
-	*nr_ids = nr_chunks * IRQS_PER_CHUNK;
+	*base = lpi_alloc_mng->base;
+	*nr_ids = lpi_alloc_mng->len;
 out:
 	spin_unlock(&lpi_lock);
@@ -1491,23 +1531,53 @@ static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
 	return bitmap;
 }
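
The split-and-merge scheme above behaves much like a buddy allocator.
As a rough user-space sketch (not the driver code), the helpers below
show the two calculations it relies on: rounding a request up to a
power of two, and finding the base of "the other half" of a block the
way its_lpi_free_chunks() does. The names roundup_pow2() and
buddy_base() are hypothetical, and the sketch assumes that len is a
power of two and base is aligned to len.

```c
#include <stdbool.h>

/* Round n up to the next power of two (n >= 1), like roundup_pow_of_two(). */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Base of the block that pairs with [base, base + len).
 * A block aligned to 2 * len is the first half of its pair, so its
 * partner follows it; otherwise the partner precedes it.
 */
static int buddy_base(int base, int len)
{
	bool first_half = (base % (len * 2)) == 0;

	return first_half ? base + len : base - len;
}
```

For example, a request for 15 LPIs is rounded to 16. A 16-LPI block at
base 8192 pairs with the block at 8208, and the block at 8208 pairs
back with 8192. Under the stated alignment assumption the same result
can be written as base ^ len.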