From patchwork Wed Oct 4 16:53:00 2023
X-Patchwork-Submitter: Matthew Wilcox
X-Patchwork-Id: 13409074
From: "Matthew Wilcox (Oracle)"
To: Andrew Morton
Cc: "Matthew Wilcox (Oracle)", linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
 torvalds@linux-foundation.org, npiggin@gmail.com
Subject: [PATCH v2 00/17] Add folio_end_read
Date: Wed, 4 Oct 2023 17:53:00 +0100
Message-Id: <20231004165317.1061855-1-willy@infradead.org>
X-Mailer: git-send-email 2.37.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

The core of this patchset is the new folio_end_read() call which
filesystems can use when finishing a page cache read instead of separate
calls to mark the folio uptodate and unlock it.  As an illustration of its
use, I converted ext4, iomap & mpage; more can be converted.

I think that's useful by itself, but the interesting optimisation is that
we can implement that with a single XOR instruction that sets the uptodate
bit, clears the lock bit, tests the waiter bit and provides a write memory
barrier.  That removes one memory barrier and one atomic instruction from
each page read, which seems worth doing.  That's in patch 15.

The last two patches could be a separate series, but basically we can do
the same thing with the writeback flag that we do with the unlock flag;
clear it and test the waiters bit at the same time.

v2:
 - Update to 6.6-rc4
 - Simplify iomap's use of folio_end_read() as suggested by Linus
 - Fix weird Alpha assembly, as suggested by Linus
 - Implement xor_unlock_is_negative_byte for Coldfire
 - Add a likely() to folio_end_read() after studying the Coldfire assembly

Matthew Wilcox (Oracle) (17):
  iomap: Hold state_lock over call to ifs_set_range_uptodate()
  iomap: Protect read_bytes_pending with the state_lock
  mm: Add folio_end_read()
  ext4: Use folio_end_read()
  buffer: Use folio_end_read()
  iomap: Use folio_end_read()
  bitops: Add xor_unlock_is_negative_byte()
  alpha: Implement xor_unlock_is_negative_byte
  m68k: Implement xor_unlock_is_negative_byte
  mips: Implement xor_unlock_is_negative_byte
  powerpc: Implement arch_xor_unlock_is_negative_byte on 32-bit
  riscv: Implement xor_unlock_is_negative_byte
  s390: Implement arch_xor_unlock_is_negative_byte
  mm: Delete checks for xor_unlock_is_negative_byte()
  mm: Add folio_xor_flags_has_waiters()
  mm: Make __end_folio_writeback() return void
  mm: Use folio_xor_flags_has_waiters() in folio_end_writeback()

 arch/alpha/include/asm/bitops.h                 | 20 +++++
 arch/m68k/include/asm/bitops.h                  | 21 +++++
 arch/mips/include/asm/bitops.h                  | 25 +++++-
 arch/mips/lib/bitops.c                          | 14 ++++
 arch/powerpc/include/asm/bitops.h               | 21 ++---
 arch/riscv/include/asm/bitops.h                 | 12 +++
 arch/s390/include/asm/bitops.h                  | 10 +++
 arch/x86/include/asm/bitops.h                   | 11 ++-
 fs/buffer.c                                     | 16 +---
 fs/ext4/readpage.c                              | 14 +---
 fs/iomap/buffered-io.c                          | 57 ++++++++------
 .../asm-generic/bitops/instrumented-lock.h      | 28 ++++---
 include/asm-generic/bitops/lock.h               | 20 +----
 include/linux/page-flags.h                      | 19 +++++
 include/linux/pagemap.h                         |  1 +
 kernel/kcsan/kcsan_test.c                       |  9 +--
 kernel/kcsan/selftest.c                         |  9 +--
 mm/filemap.c                                    | 77 ++++++++++---------
 mm/kasan/kasan_test.c                           |  8 +-
 mm/page-writeback.c                             | 35 ++++-----
 20 files changed, 255 insertions(+), 172 deletions(-)
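
For readers who want to see why a single XOR is enough, the idea described
in the cover letter can be sketched in userspace C11.  This is only an
illustration under stated assumptions, not the code from the patches: the
DEMO_* bit positions and the demo_end_read() name are invented for the
example, the success parameter is an assumption about how a failed read
would be handled, and memory_order_release stands in for the write barrier
the kernel primitive provides.  The trick relies on the caller holding the
lock with the uptodate bit still clear, so XOR flips each bit in exactly
one known direction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions for this sketch, not the kernel's PG_* values. */
enum {
	DEMO_LOCKED   = 1u << 0,
	DEMO_UPTODATE = 1u << 3,
	DEMO_WAITERS  = 1u << 7,	/* sign bit of the low byte */
};

/*
 * Finish a read with one atomic read-modify-write.
 *
 * Precondition: DEMO_LOCKED is set and DEMO_UPTODATE is clear.  Under that
 * precondition, XOR with (LOCKED | UPTODATE) clears the lock bit and sets
 * the uptodate bit in the same operation.  The value returned by the
 * fetch-xor is the old flags word, so the waiters bit can be tested
 * without a second atomic access.
 *
 * Returns true if a waiter was recorded and needs to be woken.
 */
static bool demo_end_read(_Atomic unsigned long *flags, bool success)
{
	unsigned long mask = DEMO_LOCKED;
	unsigned long old;

	if (success)
		mask |= DEMO_UPTODATE;

	/* Release ordering: earlier stores to the data are visible first. */
	old = atomic_fetch_xor_explicit(flags, mask, memory_order_release);

	/* The XOR only does the right thing if the precondition held. */
	assert(old & DEMO_LOCKED);
	assert(!(old & DEMO_UPTODATE));

	return (old & DEMO_WAITERS) != 0;
}
```

A caller would invoke demo_end_read(&flags, true) once the data has been
copied in, and wake sleepers only when it returns true.  The same shape
applies to the writeback case mentioned above: clear one known-set bit and
test the waiters bit from the old value in the same operation.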