From patchwork Sat Dec 24 08:12:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 13081291
Date: Sat, 24 Dec 2022 00:12:02 -0800
X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog
Message-ID: <20221224081203.3193960-1-zokeefe@google.com>
Subject: [PATCH v2 1/2] mm/MADV_COLLAPSE: don't expand collapse when vm_end is past requested end
From: "Zach O'Keefe"
To: linux-mm@kvack.org
Cc: stable@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton, Hugh Dickins, Yang Shi, "Zach O'Keefe"

MADV_COLLAPSE acts on one hugepage-aligned/sized region at a time, until
it has collapsed all eligible memory contained within the bounds
supplied by the user.

At the top of each hugepage iteration we (re)lock mmap_lock and
(re)validate the VMA for eligibility and update variables that might
have changed while mmap_lock was dropped.
One thing that might occur is that the VMA could be resized, and as
such, we refetch vma->vm_end to make sure we don't collapse past the
VMA's new end. However, when refetching vma->vm_end we may expand the
region acted on by MADV_COLLAPSE if vma->vm_end is greater than the
start+len supplied by the user.

The consequence here is that we may attempt to collapse more memory than
requested, possibly yielding either "too much success" or "false
failure" user-visible results. An example of the former is if we
MADV_COLLAPSE the first 4MiB of a 2TiB mmap()'d file, the incorrect
refetch would cause the operation to block for much longer than
anticipated as we attempt to collapse the entire 2TiB region. An example
of the latter is that applying MADV_COLLAPSE to a 4MiB file mapped to
the start of a 6MiB VMA will successfully collapse the first 4MiB, then
incorrectly attempt to collapse the last hugepage-aligned/sized region,
fail (since readahead/page cache lookup will fail), and report a failure
to the user.

Don't expand the acted-on region when refetching vma->vm_end.

Fixes: 4d24de9425f7 ("mm: MADV_COLLAPSE: refetch vm_end after reacquiring mmap_lock")
Reported-by: Hugh Dickins
Signed-off-by: Zach O'Keefe
Cc: Yang Shi
---
v1->v2: Updated changelog to make clear what user-visible issues this
	patch addresses, as well as to make the case for backporting
	(Andrew Morton). While there aren't any stability risks, without
	this patch there exist trivial examples where MADV_COLLAPSE
	won't work; as such, this should be backported to stable 6.1.X
	to make MADV_COLLAPSE dependable in such cases.
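
To illustrate the arithmetic behind the one-line change, here is a
minimal userspace sketch of the refetch before and after this patch. It
is not kernel code: the 2MiB HPAGE_PMD_SIZE and the helper names
refetch_hend_old(), refetch_hend_new() and regions_visited() are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: 2MiB PMD-sized hugepages (e.g. x86-64, 4KiB base pages). */
#define HPAGE_PMD_SIZE ((uint64_t)2 << 20)
#define HPAGE_PMD_MASK (~(HPAGE_PMD_SIZE - 1))

/* Hypothetical helper: smaller of two addresses. */
uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Pre-patch refetch: the user-derived hend is discarded in favor of vm_end. */
uint64_t refetch_hend_old(uint64_t hend, uint64_t vm_end)
{
	(void)hend;
	return vm_end & HPAGE_PMD_MASK;
}

/* Post-patch refetch: hend may shrink with the VMA, but never grows. */
uint64_t refetch_hend_new(uint64_t hend, uint64_t vm_end)
{
	return min_u64(hend, vm_end & HPAGE_PMD_MASK);
}

/* Number of hugepage-sized regions the collapse loop would visit in [hstart, hend). */
uint64_t regions_visited(uint64_t hstart, uint64_t hend)
{
	return hend > hstart ? (hend - hstart) / HPAGE_PMD_SIZE : 0;
}
```

With a hugepage-aligned base, a VMA spanning [base, base+6MiB) and a
request covering the first 4MiB, the old refetch yields hend = base+6MiB
(three regions visited), while the clamped refetch keeps hend =
base+4MiB (two regions). If the VMA instead shrinks to 2MiB, both
variants return base+2MiB, so the protection against collapsing past the
VMA's new end is preserved.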
v1: https://lore.kernel.org/linux-mm/CAAa6QmRx_b2UCJWE2XZ3=3c3-_N3R4cDGX6Wm4OT7qhFC6U_SQ@mail.gmail.com/T/#m6c91da3cdbd9b1d1ebb29d415962deb158a2c658
---
 mm/khugepaged.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5cb401aa2b9d..b4d2ec0a94ed 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2649,7 +2649,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
 				goto out_nolock;
 			}
 
-			hend = vma->vm_end & HPAGE_PMD_MASK;
+			hend = min(hend, vma->vm_end & HPAGE_PMD_MASK);
 		}
 		mmap_assert_locked(mm);
 		memset(cc->node_load, 0, sizeof(cc->node_load));