From patchwork Wed Sep 28 12:01:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alistair Popple
X-Patchwork-Id: 12992196
From: Alistair Popple
To: Andrew Morton, linux-mm@kvack.org
Cc: Alistair Popple, linux-kernel@vger.kernel.org,
 amd-gfx@lists.freedesktop.org, nouveau@lists.freedesktop.org,
 dri-devel@lists.freedesktop.org, Jason Gunthorpe, Felix Kuehling,
 Alex Deucher, Christian König, Ben Skeggs, Lyude Paul, Ralph Campbell,
 Alex Sierra, John Hubbard, Dan Williams
Subject: [PATCH v2 3/8] mm/memremap.c: Take a pgmap reference on page allocation
Date: Wed, 28 Sep 2022 22:01:17 +1000
Message-Id: <12d155ec727935ebfbb4d639a03ab374917ea51b.1664366292.git-series.apopple@nvidia.com>
X-Mailer: git-send-email 2.35.1

ZONE_DEVICE pages have a struct dev_pagemap which is allocated by a
driver. When the struct page is first allocated by the kernel in
memremap_pages() a reference is taken on the associated pagemap to
ensure it is not freed prior to the pages being freed.

Prior to 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page
refcount") pages were considered free and returned to the driver when
the reference count dropped to one. However the pagemap reference was
not dropped until the page reference count hit zero. This would occur as
part of the final put_page() in memunmap_pages() which would wait for
all pages to be freed prior to returning.

When the extra refcount was removed the pagemap reference was no longer
being dropped in put_page(). Instead memunmap_pages() was changed to
explicitly drop the pagemap references. This means that
memunmap_pages() can complete even though pages are still mapped by the
kernel which can lead to kernel crashes, particularly if a driver frees
the pagemap.

To fix this drivers should take a pagemap reference when allocating the
page. This reference can then be returned when the page is freed.

Signed-off-by: Alistair Popple
Fixes: 27674ef6c73f ("mm: remove the extra ZONE_DEVICE struct page refcount")
Cc: Jason Gunthorpe
Cc: Felix Kuehling
Cc: Alex Deucher
Cc: Christian König
Cc: Ben Skeggs
Cc: Lyude Paul
Cc: Ralph Campbell
Cc: Alex Sierra
Cc: John Hubbard
Cc: Dan Williams

---

Again I expect this will conflict with Dan's series. This implements the
first suggestion from Jason at
https://lore.kernel.org/linux-mm/YzLy5jJOF0jdlrJK@nvidia.com/ so
whatever we end up doing for DAX we should do the same here.
---
 mm/memremap.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/mm/memremap.c b/mm/memremap.c
index 1c2c038..421bec3 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -138,8 +138,11 @@ void memunmap_pages(struct dev_pagemap *pgmap)
 	int i;
 
 	percpu_ref_kill(&pgmap->ref);
-	for (i = 0; i < pgmap->nr_range; i++)
-		percpu_ref_put_many(&pgmap->ref, pfn_len(pgmap, i));
+	if (pgmap->type != MEMORY_DEVICE_PRIVATE &&
+	    pgmap->type != MEMORY_DEVICE_COHERENT)
+		for (i = 0; i < pgmap->nr_range; i++)
+			percpu_ref_put_many(&pgmap->ref, pfn_len(pgmap, i));
+
 	wait_for_completion(&pgmap->done);
 
 	for (i = 0; i < pgmap->nr_range; i++)
@@ -264,7 +267,9 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	memmap_init_zone_device(&NODE_DATA(nid)->node_zones[ZONE_DEVICE],
 				PHYS_PFN(range->start),
 				PHYS_PFN(range_len(range)), pgmap);
-	percpu_ref_get_many(&pgmap->ref, pfn_len(pgmap, range_id));
+	if (pgmap->type != MEMORY_DEVICE_PRIVATE &&
+	    pgmap->type != MEMORY_DEVICE_COHERENT)
+		percpu_ref_get_many(&pgmap->ref, pfn_len(pgmap, range_id));
 	return 0;
 
 err_add_memory:
@@ -502,16 +507,24 @@ void free_zone_device_page(struct page *page)
 	page->mapping = NULL;
 	page->pgmap->ops->page_free(page);
 
-	/*
-	 * Reset the page count to 1 to prepare for handing out the page again.
-	 */
 	if (page->pgmap->type != MEMORY_DEVICE_PRIVATE &&
 	    page->pgmap->type != MEMORY_DEVICE_COHERENT)
+		/*
+		 * Reset the page count to 1 to prepare for handing out the page
+		 * again.
+		 */
 		set_page_count(page, 1);
+	else
+		put_dev_pagemap(page->pgmap);
 }
 
 void zone_device_page_init(struct page *page)
 {
+	/*
+	 * Drivers shouldn't be allocating pages after calling
+	 * memunmap_pages().
+	 */
+	WARN_ON_ONCE(!percpu_ref_tryget_live(&page->pgmap->ref));
 	set_page_count(page, 1);
 	lock_page(page);
 }
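
Illustration only, not part of the patch: the refcount argument in the
commit message can be modelled in plain userspace C. The sketch below is
not kernel code. Every model_* name is hypothetical, the counter is a
simplified stand-in for pgmap->ref, and it assumes percpu_ref_kill()
behaves as "refuse new live references, then drop the initial one". It
compares bulk references dropped by teardown (the behaviour being fixed
for MEMORY_DEVICE_PRIVATE and MEMORY_DEVICE_COHERENT) with a reference
held per allocated page.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pagemap reference count. */
struct model_pgmap {
	long refs;	/* outstanding references, including the initial one */
	bool dying;	/* teardown has started; no new live references */
	bool done;	/* last reference dropped; teardown may return */
};

static void model_init(struct model_pgmap *p)
{
	p->refs = 1;
	p->dying = false;
	p->done = false;
}

static void model_get_many(struct model_pgmap *p, long n)
{
	p->refs += n;
}

static void model_put_many(struct model_pgmap *p, long n)
{
	assert(p->refs >= n);
	p->refs -= n;
	if (p->refs == 0)
		p->done = true;
}

/* Models a tryget_live operation: fails once teardown has begun. */
static bool model_tryget_live(struct model_pgmap *p)
{
	if (p->dying)
		return false;
	p->refs++;
	return true;
}

/* Models kill: forbid new live references and drop the initial one. */
static void model_kill(struct model_pgmap *p)
{
	p->dying = true;
	model_put_many(p, 1);
}

/*
 * Returns true if teardown would be allowed to finish while one page is
 * still allocated. page_holds_ref selects the per-page scheme.
 */
static bool teardown_completes_early(bool page_holds_ref)
{
	struct model_pgmap p;
	const long nr_pages = 4;	/* arbitrary size for the model */
	bool early;

	model_init(&p);
	if (!page_holds_ref)
		model_get_many(&p, nr_pages);	/* bulk refs at setup time */

	/* The driver hands out one page. */
	if (page_holds_ref) {
		bool ok = model_tryget_live(&p);

		assert(ok);
	}

	/* Teardown starts while that page is still in use. */
	model_kill(&p);
	if (!page_holds_ref)
		model_put_many(&p, nr_pages);	/* bulk refs dropped blindly */

	early = p.done;

	/* The page is finally freed. */
	if (page_holds_ref)
		model_put_many(&p, 1);

	return early;
}
```

In this model teardown_completes_early(false) reports that teardown can
finish with a page still allocated, while teardown_completes_early(true)
reports that it cannot, because the allocated page pins the count until
it is freed.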