From: Catalin Marinas
To: Linus Torvalds, Josef Bacik, David Sterba
Cc: Andreas Gruenbacher, Al Viro, Andrew Morton, Will Deacon,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-btrfs@vger.kernel.org
Subject: [PATCH 0/3] Avoid live-lock in fault-in+uaccess loops with sub-page faults
Date: Wed, 24 Nov 2021 19:20:21 +0000
Message-Id: <20211124192024.2408218-1-catalin.marinas@arm.com>

Hi,

There are a few places in the filesystem layer where a uaccess is
performed in a loop with page faults disabled, together with a
fault_in_*() call to pre-fault the pages. On architectures like arm64
with MTE (Memory Tagging Extension) or SPARC ADI, even if the
fault_in_*() succeeded, the uaccess can still fault indefinitely.

In general this is not an issue since such code restarts the
fault_in_*() from where the uaccess failed, therefore guaranteeing
forward progress. The btrfs search_ioctl(), however, rewinds the
fault_in_*() position and it can live-lock. This was reported by Al
here:

https://lore.kernel.org/r/YSqOUb7yZ7kBoKRY@zeniv-ca.linux.org.uk

There's also an analysis by Al of other fault-in places:

https://lore.kernel.org/r/YSldx9uhMYhT/G8X@zeniv-ca.linux.org.uk

and another sub-thread on the same topic:

https://lore.kernel.org/r/YXBFqD9WVuU8awIv@arm.com

So far only btrfs search_ioctl() seems to be affected and that's what
this series addresses. The existing loops like generic_perform_write()
already guarantee forward progress.

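To make the difference between the two loop styles concrete, here is a rough userspace model. It is not kernel code: every sim_* name, the single always-faulting byte and the iteration cap are assumptions made for illustration, and the "rewind" case is simplified to restarting from the start of the buffer. The fault-in model probes one byte per page and returns the number of bytes not faulted in, as the fault_in_*() helpers do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_PAGE_SIZE 4096u /* hypothetical page size for this model */
#define SIM_MAX_ITERS 1000u /* cap so the live-lock case terminates here */

/* Hypothetical user buffer with one byte that always faults on access. */
struct sim_ubuf {
	size_t len;
	size_t bad_off; /* offset of the faulting byte; >= len means none */
};

/*
 * Page-granular fault-in model: probe the first byte of the range and the
 * first byte of each following page. Returns the bytes NOT faulted in.
 */
static size_t sim_fault_in(const struct sim_ubuf *u, size_t off, size_t size)
{
	size_t end = off + size;
	size_t p = off;

	while (p < end) {
		if (p == u->bad_off)
			return end - p;
		p = (p / SIM_PAGE_SIZE + 1) * SIM_PAGE_SIZE;
	}
	return 0;
}

/* Byte-wise copy model. Returns the bytes NOT copied. */
static size_t sim_copy(const struct sim_ubuf *u, size_t off, size_t size)
{
	for (size_t i = 0; i < size; i++)
		if (off + i == u->bad_off)
			return size - i;
	return 0;
}

/*
 * Run the fault-in + copy loop and return the number of iterations.
 * rewind == false: resume where the copy stopped (forward progress).
 * rewind == true:  restart from offset 0 after a partial copy.
 */
static unsigned int sim_loop(const struct sim_ubuf *u, bool rewind)
{
	size_t off = 0;
	unsigned int iters = 0;

	assert(u != NULL);
	while (off < u->len && iters < SIM_MAX_ITERS) {
		size_t size = u->len - off;
		size_t left;

		iters++;
		if (sim_fault_in(u, off, size) == size)
			break; /* nothing could be faulted in: give up */
		left = sim_copy(u, off, size);
		if (!rewind)
			off += size - left; /* restart at the failing byte */
		else if (left == 0)
			off = u->len; /* only a full copy ends the loop */
	}
	return iters;
}
```

In this model, an 8192-byte buffer with the faulting byte at offset 100 ends the forward-progress loop on the second iteration, because the fault-in then starts exactly at the faulting byte and fails. The rewinding loop keeps probing offset 0, which succeeds every time, and only stops at the artificial cap.
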
Andreas raised a concern about O_DIRECT accesses since on fault the user
address is rewound to a block size boundary. I tried ext4, btrfs and
gfs2 and I could not get any of them to live-lock. Depending on the
alignment of the user buffer (page-aligned or not), I found two
behaviours:

- the copy to or from the user buffer succeeds entirely if it goes
  through the kernel mapping (GUP, kmap'ed page; user MTE tags are not
  checked) or

- the copy partially succeeds after a few attempts at uaccess on the
  same faulting address (the highest number of attempts in my tests was
  11 with btrfs).

Given the high cost of such sub-page probing (which is done prior to the
uaccess), my proposal is to only change the btrfs search_ioctl() (as per
the last patch). We can extend the API and call sites in the future if
needed but I hope filesystems already deal with this in other ways.

Thanks.

Catalin Marinas (3):
  mm: Introduce fault_in_exact_writeable() to probe for sub-page faults
  arm64: Add support for sub-page faults user probing
  btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page
    faults

 arch/Kconfig                     |  7 +++++++
 arch/arm64/Kconfig               |  1 +
 arch/arm64/include/asm/uaccess.h | 33 ++++++++++++++++++++++++++++++++
 fs/btrfs/ioctl.c                 |  3 ++-
 include/linux/pagemap.h          |  1 +
 include/linux/uaccess.h          | 21 ++++++++++++++++++++
 mm/gup.c                         | 19 ++++++++++++++++++
 7 files changed, 84 insertions(+), 1 deletion(-)
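
For a feel of why sub-page probing is expensive, the following userspace sketch counts the probes needed when every tag granule in a range is checked rather than one byte per page. It is only a model under stated assumptions: sim_probe_exact() and its parameters are hypothetical names, not the interface added by this series, and the 16-byte granule is the MTE tag granule size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_GRANULE 16u /* MTE tag granule size in bytes */

/*
 * Hypothetical exact probe: touch one byte in every granule of
 * [off, off + size). bad_granule is the index of the single granule that
 * faults (SIZE_MAX for none). Returns the bytes from the first faulting
 * granule to the end of the range, or 0 if every granule is accessible.
 * *probes receives the number of probes issued.
 */
static size_t sim_probe_exact(size_t bad_granule, size_t off, size_t size,
			      size_t *probes)
{
	size_t end = off + size;
	size_t p = off;

	assert(probes != NULL);
	*probes = 0;
	while (p < end) {
		(*probes)++;
		if (p / SIM_GRANULE == bad_granule)
			return end - p;
		/* advance to the start of the next granule */
		p = (p / SIM_GRANULE + 1) * SIM_GRANULE;
	}
	return 0;
}
```

With no faulting granule, probing an 8192-byte range this way issues 512 probes, against 2 for a per-page fault-in with 4K pages. That ratio is the motivation for limiting the exact probe to the one caller that needs it.
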