From patchwork Wed Mar 1 12:19:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Raghavendra K T
X-Patchwork-Id: 13155895
From: Raghavendra K T
CC: Ingo Molnar, Peter Zijlstra, Mel Gorman, Andrew Morton,
 David Hildenbrand, Bharata B Rao, Disha Talreja, Raghavendra K T
Subject: [PATCH REBASE V3 2/4] sched/numa: Enhance vma scanning logic
Date: Wed, 1 Mar 2023 17:49:01 +0530
Message-ID: <092f03105c7c1d3450f4636b1ea350407f07640e.1677672277.git.raghavendra.kt@amd.com>
X-Mailer: git-send-email 2.25.1
Sender: owner-linux-mm@kvack.org
Precedence: bulk

During NUMA scanning, make sure only the relevant VMAs of the tasks are
scanned.

Before:
 All the tasks of a process participate in scanning the VMA
 even if they do not access the VMA in its lifespan.

Now:
 Except for the first few unconditional scans, if a task does not
 touch a VMA (excluding false positives from PID collisions), it
 no longer scans that VMA.

Logic used:
1) The low 6 bits of the PID (pid % BITS_PER_LONG on 64-bit) are used to
 mark an active bit in the VMA numab state during a fault, to remember
 the PIDs accessing the VMA. (Thanks Mel)

2) Subsequently, in the scan path, VMA scanning is skipped if the current
 PID has not accessed the VMA.

3) The first two times, we allow an unconditional scan to preserve the
 earlier scanning behaviour.

Acknowledgement to Bharata B Rao for the initial patch to store PID
information, and to Peter Zijlstra (usage of test and set bit).

Suggested-by: Mel Gorman
Signed-off-by: Raghavendra K T
---
 include/linux/mm.h       | 14 ++++++++++++++
 include/linux/mm_types.h |  1 +
 kernel/sched/fair.c      | 19 +++++++++++++++++++
 mm/memory.c              |  3 +++
 4 files changed, 37 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2cce434a5e55..b7e4484af05b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1666,6 +1666,16 @@ static inline int xchg_page_access_time(struct page *page, int time)
 	last_time = page_cpupid_xchg_last(page, time >> PAGE_ACCESS_TIME_BUCKETS);
 	return last_time << PAGE_ACCESS_TIME_BUCKETS;
 }
+
+static inline void vma_set_access_pid_bit(struct vm_area_struct *vma)
+{
+	unsigned int pid_bit;
+
+	pid_bit = current->pid % BITS_PER_LONG;
+	if (vma->numab_state && !test_bit(pid_bit, &vma->numab_state->access_pids)) {
+		__set_bit(pid_bit, &vma->numab_state->access_pids);
+	}
+}
 #else /* !CONFIG_NUMA_BALANCING */
 static inline int page_cpupid_xchg_last(struct page *page, int cpupid)
 {
@@ -1715,6 +1725,10 @@ static inline bool cpupid_match_pid(struct task_struct *task, int cpupid)
 {
 	return false;
 }
+
+static inline void vma_set_access_pid_bit(struct vm_area_struct *vma)
+{
+}
 #endif /* CONFIG_NUMA_BALANCING */
 
 #if defined(CONFIG_KASAN_SW_TAGS) || defined(CONFIG_KASAN_HW_TAGS)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 1cea78f60011..df4e0bc66d17 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -477,6 +477,7 @@ struct vma_lock {
 
 struct vma_numab_state {
 	unsigned long next_scan;
+	unsigned long access_pids;
 };
 
 /*
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7c2bbc8d618b..9443ae9db028 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2928,6 +2928,21 @@ static void reset_ptenuma_scan(struct task_struct *p)
 	p->mm->numa_scan_offset = 0;
 }
 
+static bool vma_is_accessed(struct vm_area_struct *vma)
+{
+	/*
+	 * Allow unconditional access first two times, so that all the (pages)
+	 * of VMAs get prot_none fault introduced irrespective of accesses.
+	 * This is also done to avoid any side effect of task scanning
+	 * amplifying the unfairness of disjoint set of VMAs' access.
+	 */
+	if (READ_ONCE(current->mm->numa_scan_seq) < 2)
+		return true;
+
+	return test_bit(current->pid % BITS_PER_LONG,
+			&vma->numab_state->access_pids);
+}
+
 /*
  * The expensive part of numa migration is done from task_work context.
  * Triggered from task_tick_numa().
@@ -3046,6 +3061,10 @@ static void task_numa_work(struct callback_head *work)
 						vma->numab_state->next_scan))
 			continue;
 
+		/* Do not scan the VMA if task has not accessed */
+		if (!vma_is_accessed(vma))
+			continue;
+
 		do {
 			start = max(start, vma->vm_start);
 			end = ALIGN(start + (pages << PAGE_SHIFT), HPAGE_SIZE);
diff --git a/mm/memory.c b/mm/memory.c
index 255b2f4fdd4a..8fac837cde9e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4647,6 +4647,9 @@ int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,
 {
 	get_page(page);
 
+	/* Record the current PID accessing VMA */
+	vma_set_access_pid_bit(vma);
+
 	count_vm_numa_event(NUMA_HINT_FAULTS);
 	if (page_nid == numa_node_id()) {
 		count_vm_numa_event(NUMA_HINT_FAULTS_LOCAL);
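
For readers who want to see the PID-bit filter outside the kernel, here is
a minimal userspace sketch of the logic in this patch. It is not kernel
code: struct vma_model, MODEL_BITS_PER_LONG and the model_* helpers are
hypothetical names made up for illustration, and plain bit operations
stand in for test_bit()/__set_bit(). It only models the three points of
"Logic used": set a bit on fault, test it in the scan path, and let the
first two scan sequences through unconditionally.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's BITS_PER_LONG (assumption). */
#define MODEL_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical model of struct vma_numab_state, reduced to the new field. */
struct vma_model {
	unsigned long access_pids;
};

/* Fault path: remember that this PID touched the VMA. */
static void model_set_access_pid_bit(struct vma_model *vma, unsigned int pid)
{
	vma->access_pids |= 1UL << (pid % MODEL_BITS_PER_LONG);
}

/* Scan path: the first two scan sequences are unconditional. */
static bool model_vma_is_accessed(const struct vma_model *vma,
				  unsigned int pid, int scan_seq)
{
	if (scan_seq < 2)
		return true;

	return (vma->access_pids & (1UL << (pid % MODEL_BITS_PER_LONG))) != 0;
}

/* Small self-check exercising the hit, miss, warm-up and collision cases. */
void model_self_check(void)
{
	struct vma_model vma = { 0 };
	unsigned int pid = 1000;

	model_set_access_pid_bit(&vma, pid);

	/* The PID that faulted is allowed to scan. */
	assert(model_vma_is_accessed(&vma, pid, 5));
	/* A PID that never faulted is skipped after the warm-up scans... */
	assert(!model_vma_is_accessed(&vma, pid + 1, 5));
	/* ...but still scans during the first two sequences. */
	assert(model_vma_is_accessed(&vma, pid + 1, 0));
	/* A PID that hashes to the same bit is a false positive. */
	assert(model_vma_is_accessed(&vma, pid + MODEL_BITS_PER_LONG, 5));
}
```

The last assertion shows the PID-collision case mentioned in the commit
message: two PIDs that differ by a multiple of the word size share a bit,
so the second one is treated as having accessed the VMA.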