From patchwork Wed May 24 06:04:54 2023
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 13253279
From: Nicholas Piggin
To: Andrew Morton
Cc: Nicholas Piggin, Linus Torvalds, Peter Zijlstra,
 linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org
Subject: [PATCH 1/2] lazy tlb: fix hotplug exit race with MMU_LAZY_TLB_SHOOTDOWN
Date: Wed, 24 May 2023 16:04:54 +1000
Message-Id: <20230524060455.147699-1-npiggin@gmail.com>

CPU unplug first calls __cpu_disable(), and that's where powerpc calls
cleanup_cpu_mmu_context(), which clears this CPU from mm_cpumask() of
all mms in the system. However this CPU may still be using a lazy tlb
mm, and its mm_cpumask bit will be cleared from it.
The CPU does not switch away from the lazy tlb mm until
arch_cpu_idle_dead() calls idle_task_exit(). If that user mm exits in
this window, it will not be subject to the lazy tlb mm shootdown and may
be freed while in use as a lazy mm by the CPU that is being unplugged.

cleanup_cpu_mmu_context() could be moved later, but it looks better to
move the lazy tlb mm switching earlier. The problem with doing the lazy
mm switching in idle_task_exit() is explained in commit bf2c59fce4074
("sched/core: Fix illegal RCU from offline CPUs"), which added a wart to
switch away from the mm but leave it set in active_mm to be cleaned up
later.

So instead, switch away from the lazy tlb mm on the stopper kthread
before the CPU is taken down. This CPU will never switch to a user
thread from this point, so it has no chance to pick up a new lazy tlb
mm. This removes the lazy tlb mm handling wart in CPU unplug.

idle_task_exit() remains to reduce churn in the patch. It could be
removed entirely after this because finish_cpu() makes a similar check.
finish_cpu() itself is not strictly needed because init_mm will never
have its refcount drop to zero. But it is conceptually nicer to keep it
rather than have the idle thread drop the reference on the mm it is
using.
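
For readers who want the window spelled out, the following is a minimal
user-space sketch of the ordering problem described above. It is not
kernel code: struct model_mm, struct model_cpu, model_shoot_lazies() and
unplug_leaves_dangling_lazy_mm() are hypothetical names invented for
illustration, and the cpumask is reduced to a plain bitmask. It only
models the claim that a CPU whose mm_cpumask bit was cleared while it
still held the mm as a lazy tlb mm is skipped by the shootdown.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for struct mm_struct. */
struct model_mm {
	unsigned long cpumask;	/* models mm_cpumask(): bit n set => CPU n may use the mm */
	bool freed;		/* set once the mm has exited and been freed */
};

/* Hypothetical stand-in for a CPU's idle task state. */
struct model_cpu {
	struct model_mm *active_mm;	/* lazy tlb mm borrowed by the idle thread */
};

static struct model_mm model_init_mm;

/*
 * Models the lazy tlb shootdown at mm exit: only CPUs still present in
 * the mm's cpumask are asked to switch away, then the mm is freed.
 */
static void model_shoot_lazies(struct model_mm *mm, struct model_cpu *cpus)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if ((mm->cpumask & (1UL << cpu)) && cpus[cpu].active_mm == mm)
			cpus[cpu].active_mm = &model_init_mm;
	}
	mm->freed = true;
}

/*
 * Replays the unplug of CPU 1 while a user mm exits in the window.
 * Returns true if CPU 1 is left pointing at a freed mm.
 */
static bool unplug_leaves_dangling_lazy_mm(bool switch_before_mask_clear)
{
	struct model_mm user_mm = { .cpumask = 1UL << 1, .freed = false };
	struct model_cpu cpus[MODEL_NR_CPUS] = { { NULL } };

	cpus[1].active_mm = &user_mm;	/* CPU 1 idles on a lazy tlb mm */

	/* Models the early switch done by idle_task_prepare_exit(). */
	if (switch_before_mask_clear)
		cpus[1].active_mm = &model_init_mm;

	/* Models cleanup_cpu_mmu_context() clearing this CPU's bit. */
	user_mm.cpumask &= ~(1UL << 1);

	/* The user mm exits before the CPU reaches idle_task_exit(). */
	model_shoot_lazies(&user_mm, cpus);

	return cpus[1].active_mm == &user_mm && user_mm.freed;
}
```

In this model, unplug_leaves_dangling_lazy_mm(false) reproduces the old
ordering and leaves CPU 1 on the freed mm, while passing true (switch
first, then clear the mask) does not.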

Fixes: 2655421ae69fa ("lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme")
Signed-off-by: Nicholas Piggin
---
 include/linux/sched/hotplug.h |  2 ++
 kernel/cpu.c                  | 11 +++++++----
 kernel/sched/core.c           | 24 +++++++++++++++++++-----
 3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched/hotplug.h b/include/linux/sched/hotplug.h
index 412cdaba33eb..cb447d8e3f9a 100644
--- a/include/linux/sched/hotplug.h
+++ b/include/linux/sched/hotplug.h
@@ -19,8 +19,10 @@ extern int sched_cpu_dying(unsigned int cpu);
 #endif
 
 #ifdef CONFIG_HOTPLUG_CPU
+extern void idle_task_prepare_exit(void);
 extern void idle_task_exit(void);
 #else
+static inline void idle_task_prepare_exit(void) {}
 static inline void idle_task_exit(void) {}
 #endif
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index f4a2c5845bcb..584def27ff24 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -618,12 +618,13 @@ static int finish_cpu(unsigned int cpu)
 	struct mm_struct *mm = idle->active_mm;
 
 	/*
-	 * idle_task_exit() will have switched to &init_mm, now
-	 * clean up any remaining active_mm state.
+	 * idle_task_prepare_exit() ensured the idle task was using
+	 * &init_mm. Now that the CPU has stopped, drop that refcount.
 	 */
-	if (mm != &init_mm)
-		idle->active_mm = &init_mm;
+	WARN_ON(mm != &init_mm);
+	idle->active_mm = NULL;
 	mmdrop_lazy_tlb(mm);
+
 	return 0;
 }
 
@@ -1030,6 +1031,8 @@ static int take_cpu_down(void *_param)
 	enum cpuhp_state target = max((int)st->target, CPUHP_AP_OFFLINE);
 	int err, cpu = smp_processor_id();
 
+	idle_task_prepare_exit();
+
 	/* Ensure this CPU doesn't handle any more interrupts. */
 	err = __cpu_disable();
 	if (err < 0)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a68d1276bab0..bc4ef1f3394b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9373,19 +9373,33 @@ void sched_setnuma(struct task_struct *p, int nid)
  * Ensure that the idle task is using init_mm right before its CPU goes
  * offline.
  */
-void idle_task_exit(void)
+void idle_task_prepare_exit(void)
 {
 	struct mm_struct *mm = current->active_mm;
 
-	BUG_ON(cpu_online(smp_processor_id()));
-	BUG_ON(current != this_rq()->idle);
+	WARN_ON(!irqs_disabled());
 
 	if (mm != &init_mm) {
-		switch_mm(mm, &init_mm, current);
+		mmgrab_lazy_tlb(&init_mm);
+		current->active_mm = &init_mm;
+		switch_mm_irqs_off(mm, &init_mm, current);
 		finish_arch_post_lock_switch();
+		mmdrop_lazy_tlb(mm);
 	}
 
+	/* finish_cpu() will mmdrop the init_mm ref after this CPU stops */
+}
+
+/*
+ * After the CPU is offline, double check that it was previously switched to
+ * init_mm. This call can be removed because the condition is caught in
+ * finish_cpu() as well.
+ */
+void idle_task_exit(void)
+{
+	BUG_ON(cpu_online(smp_processor_id()));
+	BUG_ON(current != this_rq()->idle);
-	/* finish_cpu(), as ran on the BP, will clean up the active_mm state */
+	WARN_ON_ONCE(current->active_mm != &init_mm);
 }
 
 static int __balance_push_cpu_stop(void *arg)

From patchwork Wed May 24 06:04:55 2023
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 13253280
From: Nicholas Piggin
To: Andrew Morton
Cc: Nicholas Piggin, Linus Torvalds, Peter Zijlstra,
 linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org
Subject: [PATCH 2/2] lazy tlb: consolidate lazy tlb mm switching
Date: Wed, 24 May 2023 16:04:55 +1000
Message-Id: <20230524060455.147699-2-npiggin@gmail.com>
In-Reply-To: <20230524060455.147699-1-npiggin@gmail.com>
References: <20230524060455.147699-1-npiggin@gmail.com>

Switching a kernel thread using a "lazy tlb mm" to init_mm is a
relatively common sequence that is not quite trivial. Consolidate this
into a function.

This fixes a bug in do_shoot_lazy_tlb() for any arch that implements
finish_arch_post_lock_switch(). None select MMU_LAZY_TLB_SHOOTDOWN at
the moment.

Fixes: 2655421ae69fa ("lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme")
Signed-off-by: Nicholas Piggin
---
 arch/powerpc/mm/book3s64/radix_tlb.c |  6 +----
 include/linux/sched/task.h           |  2 ++
 kernel/fork.c                        |  7 ++----
 kernel/sched/core.c                  | 34 ++++++++++++++++++++--------
 4 files changed, 29 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index ce804b7bf84e..90953cf9f648 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -795,12 +795,8 @@ void exit_lazy_flush_tlb(struct mm_struct *mm, bool always_flush)
 		goto out;
 
 	if (current->active_mm == mm) {
-		WARN_ON_ONCE(current->mm != NULL);
 		/* Is a kernel thread and is using mm as the lazy tlb */
-		mmgrab_lazy_tlb(&init_mm);
-		current->active_mm = &init_mm;
-		switch_mm_irqs_off(mm, &init_mm, current);
-		mmdrop_lazy_tlb(mm);
+		kthread_end_lazy_tlb_mm();
 	}
 
 	/*
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 537cbf9a2ade..23693b94a09b 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -61,6 +61,8 @@ extern int lockdep_tasklist_lock_is_held(void);
 extern asmlinkage void schedule_tail(struct task_struct *prev);
 extern void init_idle(struct task_struct *idle, int cpu);
 
+extern void kthread_end_lazy_tlb_mm(void);
+
 extern int sched_fork(unsigned long clone_flags, struct task_struct *p);
 extern void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs);
 extern void sched_post_fork(struct task_struct *p);
diff --git a/kernel/fork.c b/kernel/fork.c
index ed4e01daccaa..8b005c2c7c3c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -854,11 +854,8 @@ static void do_shoot_lazy_tlb(void *arg)
 {
 	struct mm_struct *mm = arg;
 
-	if (current->active_mm == mm) {
-		WARN_ON_ONCE(current->mm);
-		current->active_mm = &init_mm;
-		switch_mm(mm, &init_mm, current);
-	}
+	if (current->active_mm == mm)
+		kthread_end_lazy_tlb_mm();
 }
 
 static void cleanup_lazy_tlbs(struct mm_struct *mm)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bc4ef1f3394b..71706df22b41 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5346,6 +5346,29 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	return finish_task_switch(prev);
 }
 
+/*
+ * If this kthread has a user process's mm for its active_mm (aka lazy tlb mm)
+ * then switch away from it, to init_mm. Must not be called while using an
+ * mm with kthread_use_mm().
+ */
+void kthread_end_lazy_tlb_mm(void)
+{
+	struct mm_struct *mm = current->active_mm;
+
+	WARN_ON_ONCE(!irqs_disabled());
+
+	if (WARN_ON_ONCE(current->mm))
+		return; /* Not a kthread or doing kthread_use_mm */
+
+	if (mm != &init_mm) {
+		mmgrab_lazy_tlb(&init_mm);
+		current->active_mm = &init_mm;
+		switch_mm_irqs_off(mm, &init_mm, current);
+		finish_arch_post_lock_switch();
+		mmdrop_lazy_tlb(mm);
+	}
+}
+
 /*
  * nr_running and nr_context_switches:
  *
@@ -9375,17 +9398,8 @@ void sched_setnuma(struct task_struct *p, int nid)
  */
 void idle_task_prepare_exit(void)
 {
-	struct mm_struct *mm = current->active_mm;
-
 	WARN_ON(!irqs_disabled());
-
-	if (mm != &init_mm) {
-		mmgrab_lazy_tlb(&init_mm);
-		current->active_mm = &init_mm;
-		switch_mm_irqs_off(mm, &init_mm, current);
-		finish_arch_post_lock_switch();
-		mmdrop_lazy_tlb(mm);
-	}
+	kthread_end_lazy_tlb_mm();
 
 	/* finish_cpu() will mmdrop the init_mm ref after this CPU stops */
 }
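
As a reading aid for the consolidated helper, here is a small user-space
sketch of the reference handoff it performs. It is not the kernel
implementation: struct model_mm, struct model_task, model_init_mm and
model_kthread_end_lazy_tlb_mm() are hypothetical names, the model
assumes the reference-counted lazy tlb configuration, and the actual
address space switch (switch_mm_irqs_off() followed by
finish_arch_post_lock_switch()) is reduced to a comment. The point it
tries to show is the ordering: take the init_mm reference, repoint
active_mm, and only then drop the old mm.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct with a lazy tlb refcount. */
struct model_mm {
	int lazy_refs;
};

/* Hypothetical stand-in for the task fields the helper looks at. */
struct model_task {
	struct model_mm *mm;		/* NULL for a kernel thread */
	struct model_mm *active_mm;	/* mm whose page tables are in use */
};

static struct model_mm model_init_mm = { .lazy_refs = 1 };

/*
 * Models the sequence in kthread_end_lazy_tlb_mm().
 * Returns -1 if the task is not a plain kthread (refused),
 *          0 if it was already on init_mm (nothing to do),
 *          1 if it was switched from a lazy tlb mm to init_mm.
 */
static int model_kthread_end_lazy_tlb_mm(struct model_task *t)
{
	struct model_mm *old = t->active_mm;

	/* Mirrors the WARN_ON_ONCE(current->mm) bail-out. */
	if (t->mm != NULL)
		return -1;

	if (old == &model_init_mm)
		return 0;

	model_init_mm.lazy_refs++;		/* mmgrab_lazy_tlb(&init_mm) */
	t->active_mm = &model_init_mm;
	/* The kernel switches the hardware context at this point. */
	old->lazy_refs--;			/* mmdrop_lazy_tlb(old) */

	return 1;
}
```

Calling the model twice on the same task shows why callers such as
idle_task_prepare_exit() do not need their own mm != &init_mm check: the
second call finds active_mm already pointing at init_mm and changes
nothing.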