From patchwork Thu May  6 23:25:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12243601
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins, peterx@redhat.com, John Hubbard, Jan Kara,
 Kirill Shutemov, Jason Gunthorpe, Andrew Morton, Kirill Tkhai,
 Michal Hocko, Oleg Nesterov, Jann Horn, Linus Torvalds,
 Matthew Wilcox, Andrea Arcangeli
Subject: [PATCH 0/3] mm/gup: Fix pin page write cache bouncing on has_pinned
Date: Thu, 6 May 2021 19:25:34 -0400
Message-Id: <20210506232537.165788-1-peterx@redhat.com>
X-Mailer: git-send-email 2.31.1

This series contains 3 patches, the 1st one enables threading for
gup_benchmark in the
kselftest. The latter two patches are collected from Andrea's local branch and
fix a write cache bouncing issue with fast-gup pinning.

To be explicit on the latter two patches:

  - the 2nd patch fixes the perf degradation that came with the introduction
    of has_pinned, then

  - the last patch tries to replace has_pinned with a bit in mm->flags

For patch 3: originally I think we had a plan to turn has_pinned into a counter
very soon. However, that has not happened so far, which suggests that we can
remove it until we really want such a counter for whatever reason. As the
commit message states, it saves 4 bytes for each mm without observable
regressions.

Regarding testing: please refer to the commit message of patch 2 for some
detailed testing with will-it-scale. Meanwhile, I did patch 1 so that we can
easily verify the patchset using the existing kselftest facilities, or even
regression-test it in the future with the repo if we want.

The numbers below are extra verification tests that I did, besides those in the
commit message of patch 2, using the new gup_benchmark and 256 cpus. The test
below was done on a 40-cpu host with Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz,
and I get similar results (of course the write cache bouncing gets more severe
with even more cores).

After patch 1 applied (only test patch, so using old kernel):

  $ sudo chrt -f 1 ./gup_test -a -m 512 -j 40
  PIN_FAST_BENCHMARK: Time: get:459632 put:5990 us
  PIN_FAST_BENCHMARK: Time: get:461967 put:5840 us
  PIN_FAST_BENCHMARK: Time: get:464521 put:6140 us
  PIN_FAST_BENCHMARK: Time: get:465176 put:7100 us
  PIN_FAST_BENCHMARK: Time: get:465960 put:6733 us
  PIN_FAST_BENCHMARK: Time: get:465324 put:6781 us
  PIN_FAST_BENCHMARK: Time: get:466018 put:7130 us
  PIN_FAST_BENCHMARK: Time: get:466362 put:7118 us
  PIN_FAST_BENCHMARK: Time: get:465118 put:6975 us
  PIN_FAST_BENCHMARK: Time: get:466422 put:6602 us
  PIN_FAST_BENCHMARK: Time: get:465791 put:6818 us
  PIN_FAST_BENCHMARK: Time: get:467091 put:6298 us
  PIN_FAST_BENCHMARK: Time: get:467694 put:5432 us
  PIN_FAST_BENCHMARK: Time: get:469575 put:5581 us
  PIN_FAST_BENCHMARK: Time: get:468124 put:6055 us
  PIN_FAST_BENCHMARK: Time: get:468877 put:6720 us
  PIN_FAST_BENCHMARK: Time: get:467212 put:4961 us
  PIN_FAST_BENCHMARK: Time: get:467834 put:6697 us
  PIN_FAST_BENCHMARK: Time: get:470778 put:6398 us
  PIN_FAST_BENCHMARK: Time: get:469788 put:6310 us
  PIN_FAST_BENCHMARK: Time: get:488277 put:7113 us
  PIN_FAST_BENCHMARK: Time: get:486613 put:7085 us
  PIN_FAST_BENCHMARK: Time: get:486940 put:7202 us
  PIN_FAST_BENCHMARK: Time: get:488728 put:7101 us
  PIN_FAST_BENCHMARK: Time: get:487570 put:7327 us
  PIN_FAST_BENCHMARK: Time: get:489260 put:7027 us
  PIN_FAST_BENCHMARK: Time: get:488846 put:6866 us
  PIN_FAST_BENCHMARK: Time: get:488521 put:6745 us
  PIN_FAST_BENCHMARK: Time: get:489950 put:6459 us
  PIN_FAST_BENCHMARK: Time: get:489777 put:6617 us
  PIN_FAST_BENCHMARK: Time: get:488224 put:6591 us
  PIN_FAST_BENCHMARK: Time: get:488644 put:6477 us
  PIN_FAST_BENCHMARK: Time: get:488754 put:6711 us
  PIN_FAST_BENCHMARK: Time: get:488875 put:6743 us
  PIN_FAST_BENCHMARK: Time: get:489290 put:6657 us
  PIN_FAST_BENCHMARK: Time: get:490264 put:6684 us
  PIN_FAST_BENCHMARK: Time: get:489631 put:6737 us
  PIN_FAST_BENCHMARK: Time: get:488434 put:6655 us
  PIN_FAST_BENCHMARK: Time: get:492213 put:6297 us
  PIN_FAST_BENCHMARK: Time: get:491124 put:6173 us

After the whole series applied (new fixed kernel):

  $ sudo chrt -f 1 ./gup_test -a -m 512 -j 40
  PIN_FAST_BENCHMARK: Time: get:82038 put:7041 us
  PIN_FAST_BENCHMARK: Time: get:82144 put:6817 us
  PIN_FAST_BENCHMARK: Time: get:83417 put:6674 us
  PIN_FAST_BENCHMARK: Time: get:82540 put:6594 us
  PIN_FAST_BENCHMARK: Time: get:83214 put:6681 us
  PIN_FAST_BENCHMARK: Time: get:83444 put:6889 us
  PIN_FAST_BENCHMARK: Time: get:83194 put:7499 us
  PIN_FAST_BENCHMARK: Time: get:84876 put:7369 us
  PIN_FAST_BENCHMARK: Time: get:86092 put:10289 us
  PIN_FAST_BENCHMARK: Time: get:86153 put:10415 us
  PIN_FAST_BENCHMARK: Time: get:85026 put:7751 us
  PIN_FAST_BENCHMARK: Time: get:85458 put:7944 us
  PIN_FAST_BENCHMARK: Time: get:85735 put:8154 us
  PIN_FAST_BENCHMARK: Time: get:85851 put:8299 us
  PIN_FAST_BENCHMARK: Time: get:86323 put:9617 us
  PIN_FAST_BENCHMARK: Time: get:86288 put:10496 us
  PIN_FAST_BENCHMARK: Time: get:87697 put:9346 us
  PIN_FAST_BENCHMARK: Time: get:87980 put:8382 us
  PIN_FAST_BENCHMARK: Time: get:88719 put:8400 us
  PIN_FAST_BENCHMARK: Time: get:87616 put:8588 us
  PIN_FAST_BENCHMARK: Time: get:86730 put:9563 us
  PIN_FAST_BENCHMARK: Time: get:88167 put:8673 us
  PIN_FAST_BENCHMARK: Time: get:86844 put:9777 us
  PIN_FAST_BENCHMARK: Time: get:88068 put:11774 us
  PIN_FAST_BENCHMARK: Time: get:86170 put:15676 us
  PIN_FAST_BENCHMARK: Time: get:87967 put:12827 us
  PIN_FAST_BENCHMARK: Time: get:95773 put:7652 us
  PIN_FAST_BENCHMARK: Time: get:87734 put:13650 us
  PIN_FAST_BENCHMARK: Time: get:89833 put:14237 us
  PIN_FAST_BENCHMARK: Time: get:96186 put:8029 us
  PIN_FAST_BENCHMARK: Time: get:95532 put:8886 us
  PIN_FAST_BENCHMARK: Time: get:95351 put:5826 us
  PIN_FAST_BENCHMARK: Time: get:96401 put:8407 us
  PIN_FAST_BENCHMARK: Time: get:96473 put:8287 us
  PIN_FAST_BENCHMARK: Time: get:97177 put:8430 us
  PIN_FAST_BENCHMARK: Time: get:98120 put:5263 us
  PIN_FAST_BENCHMARK: Time: get:96271 put:7757 us
  PIN_FAST_BENCHMARK: Time: get:99628 put:10467 us
  PIN_FAST_BENCHMARK: Time: get:99344 put:10045 us
  PIN_FAST_BENCHMARK: Time: get:94212 put:15485 us

Summary:

  Old kernel: 477729.97 (+-3.79%)
  New kernel:  89144.65 (+-11.76%)

I'm not sure whether I should add a Fixes tag for patch 2. If so, it would be:

  Fixes: 008cfe4418b3d ("mm: Introduce mm_struct.has_pinned")

and then cc stable for 5.9+. However, I'll skip adding it unless someone asks,
as this is a perf fix, and frequent concurrent pinning should not really happen
that much.

Please review, thanks.

Andrea Arcangeli (2):
  mm: gup: allow FOLL_PIN to scale in SMP
  mm: gup: pack has_pinned in MMF_HAS_PINNED

Peter Xu (1):
  mm/gup_benchmark: Support threading

 fs/proc/task_mmu.c                    |  2 +-
 include/linux/mm.h                    |  2 +-
 include/linux/mm_types.h              | 10 ---
 include/linux/sched/coredump.h        |  1 +
 kernel/fork.c                         |  1 -
 mm/gup.c                              |  9 +--
 tools/testing/selftests/vm/gup_test.c | 94 ++++++++++++++++++---------
 7 files changed, 71 insertions(+), 48 deletions(-)

-- 
2.31.1
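
A note on the Summary figures: the letter does not say how they were derived,
but they are consistent with the arithmetic mean of the 40 "get" times of each
run, with the +- value being the largest deviation from that mean expressed as
a percentage of it. The following is a minimal sketch under that assumption;
the function name and the regular expression are illustrative and are not part
of gup_test or of this series.

```python
import re

# Matches lines such as "PIN_FAST_BENCHMARK: Time: get:459632 put:5990 us".
LINE_RE = re.compile(r"PIN_FAST_BENCHMARK: Time: get:(\d+) put:(\d+) us")


def summarize_get_times(log_text):
    """Return (mean, max deviation from the mean in percent) of the get times.

    Assumption: the cover letter's "+-" is the largest absolute deviation
    from the mean, relative to the mean. The letter itself does not say so.
    """
    gets = [int(m.group(1)) for m in LINE_RE.finditer(log_text)]
    if not gets:
        raise ValueError("no PIN_FAST_BENCHMARK lines found")
    mean = sum(gets) / len(gets)
    max_dev = max(abs(g - mean) for g in gets)
    return mean, 100.0 * max_dev / mean


# First three lines of the old-kernel output, used only as a small example.
sample = """\
PIN_FAST_BENCHMARK: Time: get:459632 put:5990 us
PIN_FAST_BENCHMARK: Time: get:461967 put:5840 us
PIN_FAST_BENCHMARK: Time: get:464521 put:6140 us
"""

mean, dev_pct = summarize_get_times(sample)
print(f"{mean:.2f} (+-{dev_pct:.2f}%)")
```

Feeding the full 40-line output of each run into summarize_get_times() should
give the old-kernel and new-kernel figures in the Summary, up to rounding of
the last digit.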