From: Yicong Yang
Subject: [PATCH v10 0/3] sched/fair: Scan cluster before scanning LLC in wake-up path
Date: Thu, 12 Oct 2023 20:17:04 +0800
Message-ID: <20231012121707.51368-1-yangyicong@huawei.com>

This is follow-up work to support the cluster scheduler. Previously we
added a cluster level to the scheduler for both ARM64[1] and X86[2] to
support load balancing between clusters, which brings more memory
bandwidth and decreases cache contention. This patchset, on the other
hand, takes care of the wake-up path by trying CPUs within the same
cluster before scanning the whole LLC, to benefit tasks that communicate
with each other.

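As a rough illustration of the scan order described above, the following userspace sketch models one LLC as a bitmask of idle CPUs. It is not the kernel code from this series: the topology constants, `cluster_first_cpu()` and `select_idle_cpu_sketch()` are hypothetical names invented for the example, and the real wake-up path works on sched domains and cpumasks rather than a fixed-size mask.

```c
#include <stdint.h>

#define NR_CPUS       8   /* hypothetical: one LLC with 8 CPUs */
#define CLUSTER_SIZE  4   /* hypothetical: 4 CPUs per cluster */

/* Hypothetical helper: first CPU of the cluster containing @cpu. */
static int cluster_first_cpu(int cpu)
{
	return cpu - (cpu % CLUSTER_SIZE);
}

/*
 * Toy model of the wake-up scan order: look for an idle CPU in the
 * target's cluster first, then in the rest of the LLC.
 * Bit n of @idle_mask is set when CPU n is idle.
 * Returns the chosen CPU, or -1 if no CPU in the LLC is idle.
 */
static int select_idle_cpu_sketch(int target, uint32_t idle_mask)
{
	int base = cluster_first_cpu(target);
	int i, cpu;

	/* Pass 1: the target's cluster, wrapping around from the target. */
	for (i = 0; i < CLUSTER_SIZE; i++) {
		cpu = base + (target - base + i) % CLUSTER_SIZE;
		if (idle_mask & (1u << cpu))
			return cpu;
	}

	/* Pass 2: remaining CPUs in the LLC, wrapping around from the target. */
	for (i = 0; i < NR_CPUS; i++) {
		cpu = (target + i) % NR_CPUS;
		if (cpu >= base && cpu < base + CLUSTER_SIZE)
			continue;	/* already scanned in pass 1 */
		if (idle_mask & (1u << cpu))
			return cpu;
	}

	return -1;
}
```

With target CPU 1 and CPUs 3 and 5 idle, the sketch picks CPU 3 because it shares the target's cluster, even though CPU 5 is also idle in the same LLC.
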
[1] 778c558f49a2 ("sched: Add cluster scheduler level in core and related Kconfig for ARM64")
[2] 66558b730f25 ("sched: Add cluster scheduler level for x86")

Change since v9:
- Since EEVDF has been merged in mainline, rebase and test on tip-sched-core
- Split out Patch 3/3 to solve the hackbench regression on Jacobsville,
  for easier review
Link: https://lore.kernel.org/lkml/20230719092838.2302-1-yangyicong@huawei.com/

Change since v8:
- Peter found cpus_share_lowest_cache() weird, so fall back to
  cpus_share_resources() as suggested in v4
- Use sd->groups->flags to find the cluster when scanning, saving one
  per-cpu pointer
- Fix sched_cluster_active being enabled incorrectly on domain degeneration
- Use sched_cluster_active to avoid repeated checks on non-cluster
  machines, per Gautham
Link: https://lore.kernel.org/all/20230530070253.33306-1-yangyicong@huawei.com/

Change since v7:
- Optimize by choosing prev_cpu/recent_used_cpu when possible after
  failing to find an idle CPU in the cluster/LLC. Thanks Chen Yu for
  testing on Jacobsville
Link: https://lore.kernel.org/all/20220915073423.25535-1-yangyicong@huawei.com/

Change for RESEND:
- Collect tag from Chen Yu and rebase on the latest tip/sched/core. Thanks.
Link: https://lore.kernel.org/lkml/20220822073610.27205-1-yangyicong@huawei.com/

Change since v6:
- rebase on 6.0-rc1
Link: https://lore.kernel.org/lkml/20220726074758.46686-1-yangyicong@huawei.com/

Change since v5:
- Improve patch 2 according to Peter's suggestion:
  - use sched_cluster_active to indicate whether cluster is active
  - consider the SMT case and use wrap iteration when scanning the cluster
- Add Vincent's tag
Thanks.
Link: https://lore.kernel.org/lkml/20220720081150.22167-1-yangyicong@hisilicon.com/

Change since v4:
- rename cpus_share_resources to cpus_share_lowest_cache to be more
  informative, per Tim
- return -1 when nr==0 in scan_cluster(), per Abel
Thanks!
Link: https://lore.kernel.org/lkml/20220609120622.47724-1-yangyicong@hisilicon.com/

Change since v3:
- fix compile error when !CONFIG_SCHED_CLUSTER, reported by lkp test.
Link: https://lore.kernel.org/lkml/20220608095758.60504-1-yangyicong@hisilicon.com/

Change since v2:
- leverage SIS_PROP to suspend redundant scanning when LLC is overloaded
- remove the ping-pong suppression
- address the comment from Tim, thanks.
Link: https://lore.kernel.org/lkml/20220126080947.4529-1-yangyicong@hisilicon.com/

Change since v1:
- re-collect the performance data based on v5.17-rc1
- rename cpus_share_cluster to cpus_share_resources per Vincent and
  Gautham, thanks!
Link: https://lore.kernel.org/lkml/20211215041149.73171-1-yangyicong@hisilicon.com/

Barry Song (2):
  sched: Add cpus_share_resources API
  sched/fair: Scan cluster before scanning LLC in wake-up path

Yicong Yang (1):
  sched/fair: Use candidate prev/recent_used CPU if scanning failed for
    cluster wakeup

 include/linux/sched/sd_flags.h |  7 ++++
 include/linux/sched/topology.h |  8 ++++-
 kernel/sched/core.c            | 12 +++++++
 kernel/sched/fair.c            | 60 +++++++++++++++++++++++++++++++---
 kernel/sched/sched.h           |  2 ++
 kernel/sched/topology.c        | 25 ++++++++++++++
 6 files changed, 108 insertions(+), 6 deletions(-)
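
The v7 changelog item about choosing prev_cpu/recent_used_cpu after a failed scan can be pictured with a small userspace model. This is only a sketch under assumptions: `cpu_is_idle()` and `wake_cpu_after_scan()` are hypothetical names, the preference order (prev before recent_used before target) is assumed for illustration, and the actual conditions in patch 3/3 may be stricter than a plain idle check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bit n of @idle_mask is set when CPU n is idle. */
static bool cpu_is_idle(int cpu, uint32_t idle_mask)
{
	return cpu >= 0 && cpu < 32 && (idle_mask & (1u << cpu));
}

/*
 * Toy model of the fallback after scanning the cluster/LLC.
 * @scanned is the scan result, or -1 if no idle CPU was found.
 * Assumed order: scan result, then prev, then recent_used, then target.
 */
static int wake_cpu_after_scan(int target, int scanned, int prev,
			       int recent_used, uint32_t idle_mask)
{
	if (scanned >= 0)
		return scanned;		/* the scan found an idle CPU */
	if (cpu_is_idle(prev, idle_mask))
		return prev;		/* candidate recorded before scanning */
	if (cpu_is_idle(recent_used, idle_mask))
		return recent_used;
	return target;			/* nothing better, stay on target */
}
```

In this model a busy LLC no longer forces the wake-up onto the target when the task's previous CPU is still idle, which is the situation the changelog associates with the hackbench regression on Jacobsville.
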