From patchwork Mon Jan 29 07:34:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Chintan Pandya X-Patchwork-Id: 10189369 X-Patchwork-Delegate: agross@codeaurora.org Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B471060375 for ; Mon, 29 Jan 2018 07:34:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A427928749 for ; Mon, 29 Jan 2018 07:34:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 98139287F4; Mon, 29 Jan 2018 07:34:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 75AED28749 for ; Mon, 29 Jan 2018 07:34:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751398AbeA2He5 (ORCPT ); Mon, 29 Jan 2018 02:34:57 -0500 Received: from smtp.codeaurora.org ([198.145.29.96]:54572 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751083AbeA2He4 (ORCPT ); Mon, 29 Jan 2018 02:34:56 -0500 Received: by smtp.codeaurora.org (Postfix, from userid 1000) id 768A5609EF; Mon, 29 Jan 2018 07:34:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1517211295; bh=T39rvpESBA4xfzaAFti6keiCxhaoOg+xKiIP3TiW44c=; h=Subject:To:Cc:From:References:Date:In-Reply-To:From; b=U6k5w7jlQpPK+uz1LTEYjr14kj8oUz1JtbRhX6Urud7jlluw95vYpD5LwEQHC+3so imf4H9JC+ST3ESdR2wDHBzgrZn1+QTQi06i3+SxCyH5YdVIAwIQDN4IwkkjDi4M4b5 O3HasFTWe8sK2wj/I8Dznc6YqINQ+NCXwtuyQ0BE= Received: from [10.204.100.248] (blr-c-bdr-fw-01_globalnat_allzones-outside.qualcomm.com [103.229.19.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: cpandya@smtp.codeaurora.org) by smtp.codeaurora.org (Postfix) with ESMTPSA id 0ED0E60592; Mon, 29 Jan 2018 07:34:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org; s=default; t=1517211294; bh=T39rvpESBA4xfzaAFti6keiCxhaoOg+xKiIP3TiW44c=; h=Subject:To:Cc:From:References:Date:In-Reply-To:From; b=D0mdLxFyjrdDZPM3eDCgT2U0xPOi7D5Aq9PFN7RN9n+LeZkwu6uENyZN/Xe11KgCU pCOHH2vXSLxxJXIIXEYE2b/ZckYIuZ0/yMW7+nn+B/dju7LQ2RD4MfuBGTOSWlXH8X l6CCL7Mek0TgQeS/RkrTbNPqsP7iQpCNQsargNEY= DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org 0ED0E60592 Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=cpandya@codeaurora.org Subject: Re: [PATCH v2] of: use hash based search in of_find_node_by_phandle To: Rob Herring Cc: Rasmus Villemoes , Frank Rowand , "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS" , "linux-kernel@vger.kernel.org" , linux-arm-msm From: Chintan Pandya References: <1516955496-17236-1-git-send-email-cpandya@codeaurora.org> <20180126213943.7zwijanlbrq3y666@rob-hp-laptop> Message-ID: Date: Mon, 29 Jan 2018 13:04:50 +0530 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: <20180126213943.7zwijanlbrq3y666@rob-hp-laptop> Content-Language: en-US Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP > I was curious, so I implemented it. It ends up being similar to Rasmus's > 1st suggestion. The difference is we don't try to store all entries, but > rather implement a hash table that doesn't handle collisions. Relying on > the fact that phandles are just linearly allocated from 0, we just mask > the high bits of the phandle to get the index. I think this is most resourceful way. > Can you try out on your setup and try different > array sizes. Here are my test results. However, I simply considered overall boot time to compare different scenarios because profiling of_find_node_by_phandle() in early boot fails. Scenarios: [1] Cache size 1024 + early cache build up [Small change in your cache patch,     see the patch below] [2] Hash 64 approach[my original v2 patch] [3] Cache size 64 [4] Cache size 128 [5] Cache size 256 [6] Base build Result (boot to shell in sec): [1] 14.292498 14.370994 14.313537 --> 850ms avg gain [2] 14.340981 14.395900 14.398149 --> 800ms avg gain [3] 14.546429 14.488783 14.468694 --> 680ms avg gain [4] 14.506007 14.497487 14.523062 --> 670ms avg gain [5] 14.671100 14.643344 14.731853 --> 500ms avg gain [6] 15.263386 15.246155 15.041847 --> 0 Previously reported 400ms gain for [2] was from different set up. These tests and new data is from my own debug set up. When we take any of these patch to production, result might deviate accordingly. Chintan Pandya Patch for the case [1] <---------------------------------------------------------------------------> diff --git a/drivers/of/base.c b/drivers/of/base.c index a0bccb5..671e7cf 100644 --- a/drivers/of/base.c +++ b/drivers/of/base.c @@ -1084,6 +1084,8 @@ int of_modalias_node(struct device_node *node, char *modalias, int len)  }  EXPORT_SYMBOL_GPL(of_modalias_node); +struct device_node *phandle_cache[PHANDLE_CACHE_SZ]; +  /**   * of_find_node_by_phandle - Find a node given a phandle   * @handle:    phandle of the node to find @@ -1100,9 +1102,15 @@ struct device_node *of_find_node_by_phandle(phandle handle)                 return NULL;         raw_spin_lock_irqsave(&devtree_lock, flags); -       for_each_of_allnodes(np) -               if (np->phandle == handle) -                       break; +       np = phandle_cache[PHANDLE_CACHE_HASH(handle)]; +       if (!np || np->phandle != handle) { +               for_each_of_allnodes(np) +                       if (np->phandle == handle) { + phandle_cache[PHANDLE_CACHE_HASH(handle)] = np; +                               break; +                       } + +       }         of_node_get(np);         raw_spin_unlock_irqrestore(&devtree_lock, flags);         return np; diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index c0914fb..e90a458 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -227,6 +227,9 @@ static void populate_properties(const void *blob,                 pprev      = &pp->next;         } +       if (!dryrun && np->phandle) +               phandle_cache[PHANDLE_CACHE_HASH(np->phandle)] = np; +         /* With version 0x10 we may not have the name property,          * recreate it here from the unit name if absent          */ diff --git a/include/linux/of.h b/include/linux/of.h index 299aeb1..b1200be 100644 --- a/include/linux/of.h +++ b/include/linux/of.h @@ -92,6 +92,10 @@ struct of_phandle_iterator {         struct device_node *node;  }; +#define PHANDLE_CACHE_SZ       1024 +#define PHANDLE_CACHE_HASH(x)  ((x) & (PHANDLE_CACHE_SZ - 1)) +extern struct device_node *phandle_cache[PHANDLE_CACHE_SZ]; +  struct of_reconfig_data {         struct device_node      *dn;         struct property         *prop;