From patchwork Thu Feb 16 08:36:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 13142717 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0389AC636CC for ; Thu, 16 Feb 2023 08:36:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 395C96B0071; Thu, 16 Feb 2023 03:36:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 346CC6B0072; Thu, 16 Feb 2023 03:36:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2343E6B0073; Thu, 16 Feb 2023 03:36:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 1481B6B0071 for ; Thu, 16 Feb 2023 03:36:08 -0500 (EST) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id D71A8A1183 for ; Thu, 16 Feb 2023 08:36:07 +0000 (UTC) X-FDA: 80472497574.20.95F6470 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by imf01.hostedemail.com (Postfix) with ESMTP id 110E640004 for ; Thu, 16 Feb 2023 08:36:04 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b="F/UOESeP"; spf=pass (imf01.hostedemail.com: domain of dan.j.williams@intel.com designates 192.55.52.93 as permitted sender) smtp.mailfrom=dan.j.williams@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1676536565; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding:in-reply-to: references:dkim-signature; bh=VHGsyA5AQp3dkl3/OFNwoHtagseRdkQGpProFhhnwkM=; b=1j3ga+DL/2gzUxWDJNrAUggF+f7wXD4LaifpA4UCzbLptBgpgIdeuOzY6oj0fMs4XiNjOX j0RZ7ytJMUYPROguYp3SxBCoqRnM4ZrwdMGPa9Qi67syB7v6+j/o923qG182CkV4kE9swL arI/7vSALJ3TPlo5WqlH/MaBu+gWvWI= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b="F/UOESeP"; spf=pass (imf01.hostedemail.com: domain of dan.j.williams@intel.com designates 192.55.52.93 as permitted sender) smtp.mailfrom=dan.j.williams@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1676536565; a=rsa-sha256; cv=none; b=KwvBA+JadEfzbERNJgfUE1LybOB43pF1sDgtdmNRrPcbYTSF3nkOfsFmsa3dJBzVw3EKTk 7zJmVt6L/P7haHUWvfslx5fCIbHx17nyt1DHRMXtBGIvKc2HeW5LcHHg3KIWBY90oUrBDo Z1H1PypXceZPUVgLx1vPku8aLRQplhg= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1676536565; x=1708072565; h=subject:from:to:cc:date:message-id:mime-version: content-transfer-encoding; bh=z3HOf/V6AqUmGiXNGqZnhtXn++kw0UgCfhTsbKX5VlI=; b=F/UOESePWbQUqJxuHyL0yeOaqLFYfq9cXcnGh0KKOqA0sLPiH2mqahDD OXBa7kleXti6H4AiQeXP0BjNEO5pa83ysGpyxj6wcSuFZHgKVMjBUdceD qHoVL8qend4+2WkR8E896Z+b6kNzqzB0mU5KPMOG/M9zaOFikIOeWHyNf t3RP3xYheiaBbMaAefsYnGg1aPZV3iXNe44evRK9wtxU93EJse0u2n6XE 6VFvixPrUkyvllBFS3M+0BlQcDrpx/utmVxc0ufDqA7vV1iTMDALETpOj nogGV0bHhP50up4bvtJ1Iu9rsE706E28qKaMWErpkJhmeE+tStA/pojJK w==; X-IronPort-AV: E=McAfee;i="6500,9779,10622"; a="329381553" X-IronPort-AV: E=Sophos;i="5.97,301,1669104000"; d="scan'208";a="329381553" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Feb 2023 00:36:03 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10622"; a="793947757" X-IronPort-AV: E=Sophos;i="5.97,301,1669104000"; d="scan'208";a="793947757" Received: from smckenne-mobl1.amr.corp.intel.com (HELO dwillia2-xfh.jf.intel.com) ([10.209.112.238]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Feb 2023 00:36:02 -0800 Subject: [PATCH] dax/kmem: Fix leak of memory-hotplug resources From: Dan Williams To: linux-cxl@vger.kernel.org Cc: stable@vger.kernel.org, Oscar Salvador , David Hildenbrand , Pavel Tatashin , linux-mm@kvack.org Date: Thu, 16 Feb 2023 00:36:02 -0800 Message-ID: <167653656244.3147810.5705900882794040229.stgit@dwillia2-xfh.jf.intel.com> User-Agent: StGit/0.18-3-g996c MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: 1wb45ah5bq9n7u5e3wdd8o1n75ak46gs X-Rspamd-Queue-Id: 110E640004 X-HE-Tag: 1676536564-623964 X-HE-Meta: U2FsdGVkX18Hw4pEmaOzW9BCWmQendFhJjkI/gOGiuszMxVNS7i71dW0jceYqwP0DV8JNOjzeAOZnY0Amhpe/l3lVFXM69tlUyYnvHBJQAzCaJTdvuK1F7GwqHqj4f8uriAIduDeBr0ejLTF3CwxiXAgtCOaNv/h1gOSKUOG6rHlCdCgTNvyeq/q+OGdbZeNq56gYJx+ndEMJ1WP8nXPDvT65YPMl6qTeQgEfCuekKHYoLaF0YGJCi7NVu4VHJm22CVpKzMuiVbmIwTkhZJzMEP0Uq5KmW+w+v6VkIlk8xQxO0D+XxqW4AmjE3xovaXuaYOxU1NN8Gme71Sqd+is17VJPfQQl9+bqKsKKAVDm6SvUc85p8w2obeYFpKdkMJhI9dmjUU/PrMDRbZr99aXhppoe2kVdrz0r0xJLsmIV6NK9qMxg/VP2XugZXMDjLy2QoPRi3xhvAocgnkw9o/TZCSsQtnrPjyyYXBX6NORKSv0s/Fkm2xV8h3j0e+1Pi5en7iblpZ9/zlzsvJhVtbnOfb+sqV1Bhex8Fmy6I5w6CwAlUSWuhSHS+2w6cpG6ZiccRkcrCaJdmZa82fUMPOyq9npkCwe5yjXk51WCN5LnCD9POA1NgULjCtKE5uq0cTQQaszEMx0ClSm1q2MP0/jfXDeh5F4qJPCKHXNw7IFDYbOw9bC1m4+5Ve1qdfUhj+9rkMPqkdGas47zk6MXTGiiW6XQST36N3Ao4O8q/HWohkdB9Z9zKyw20unvuOc46CGeVa1+7oxUCsAmT3Mk+4330V3oIlNInyX+HlCSpNdIAX+tqInkj9otkzUhpo6hZuE+++2JjJfl1EzrGnGfWPWBE4sZ8XUzMmxxIPd78jJaZXjs1rgl+mOn55pDJW1z97sGEVwzBB6v6DJvVlAqOCSN+tJ5gUszLDve0AdEe1vRQdYGcmlMmL83UJsJShK+WQNKpmmj7eo74xVYV9hNwh orW3lB1w aVjlKg5lQUWMx6Xcb+zepJhAN8kMqzbSLSysQY1w9lVUsd9weseN8a22BzGq08BiWqgNcP98PLuqMwMWmKEz+7IbaaOTs4JEoTVFSZrFt9+ryRHMHje7xU9EbjtvI59JUisUfyhq6guNb7O7BAIaG8Im/PouCI4QTxHrmTAFFmdK1DNaOuMaXY0jlyGWMZXdAAO4Nl8T/lHUqe1GZjbQyIcLTLDPA+H+Sm5xNyrfkr77nkbwjsF8nzwVeKIVbqd6hYzMluqqIBHicfoogzU8y2reVFNjqyfbHVfRY+uhBJ37s5Uo8rqM6szk/8ep2KOIbczX/vDBxrHBAiO8= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: While experimenting with CXL region removal the following corruption of /proc/iomem appeared. Before: f010000000-f04fffffff : CXL Window 0 f010000000-f02fffffff : region4 f010000000-f02fffffff : dax4.0 f010000000-f02fffffff : System RAM (kmem) After (modprobe -r cxl_test): f010000000-f02fffffff : **redacted binary garbage** f010000000-f02fffffff : System RAM (kmem) ...and testing further the same is visible with persistent memory assigned to kmem: Before: 480000000-243fffffff : Persistent Memory 480000000-57e1fffff : namespace3.0 580000000-243fffffff : dax3.0 580000000-243fffffff : System RAM (kmem) After (ndctl disable-region all): 480000000-243fffffff : Persistent Memory 580000000-243fffffff : ***redacted binary garbage*** 580000000-243fffffff : System RAM (kmem) The corrupted data is from a use-after-free of the "dax4.0" and "dax3.0" resources, and it also shows that the "System RAM (kmem)" resource is not being removed. The bug does not appear after "modprobe -r kmem", it requires the parent of "dax4.0" and "dax3.0" to be removed which re-parents the leaked "System RAM (kmem)" instances. Those in turn reference the freed resource as a parent. First up for the fix is release_mem_region_adjustable() needs to reliably delete the resource inserted by add_memory_driver_managed(). That is thwarted by a check for IORESOURCE_SYSRAM that predates the dax/kmem driver, from commit: 65c78784135f ("kernel, resource: check for IORESOURCE_SYSRAM in release_mem_region_adjustable") That appears to be working around the behavior of HMM's "MEMORY_DEVICE_PUBLIC" facility that has since been deleted. With that check removed the "System RAM (kmem)" resource gets removed, but corruption still occurs occasionally because the "dax" resource is not reliably removed. The dax range information is freed before the device is unregistered, so the driver can not reliably recall (another use after free) what it is meant to release. Lastly if that use after free got lucky, the driver was covering up the leak of "System RAM (kmem)" due to its use of release_resource() which detaches, but does not free, child resources. The switch to remove_resource() forces remove_memory() to be responsible for the deletion of the resource added by add_memory_driver_managed(). Fixes: c2f3011ee697 ("device-dax: add an allocation interface for device-dax instances") Cc: Cc: Oscar Salvador Cc: David Hildenbrand Cc: Pavel Tatashin Signed-off-by: Dan Williams Reviewed-by: Dave Jiang Reviewed-by: Pasha Tatashin Reviewed-by: Vishal Verma Reviewed-by: Alistair Popple --- drivers/dax/bus.c | 2 +- drivers/dax/kmem.c | 4 ++-- kernel/resource.c | 14 -------------- 3 files changed, 3 insertions(+), 17 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 012d576004e9..67a64f4c472d 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -441,8 +441,8 @@ static void unregister_dev_dax(void *dev) dev_dbg(dev, "%s\n", __func__); kill_dev_dax(dev_dax); - free_dev_dax_ranges(dev_dax); device_del(dev); + free_dev_dax_ranges(dev_dax); put_device(dev); } diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index 918d01d3fbaa..7b36db6f1cbd 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -146,7 +146,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax) if (rc) { dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n", i, range.start, range.end); - release_resource(res); + remove_resource(res); kfree(res); data->res[i] = NULL; if (mapped) @@ -195,7 +195,7 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax) rc = remove_memory(range.start, range_len(&range)); if (rc == 0) { - release_resource(data->res[i]); + remove_resource(data->res[i]); kfree(data->res[i]); data->res[i] = NULL; success++; diff --git a/kernel/resource.c b/kernel/resource.c index ddbbacb9fb50..b1763b2fd7ef 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1343,20 +1343,6 @@ void release_mem_region_adjustable(resource_size_t start, resource_size_t size) continue; } - /* - * All memory regions added from memory-hotplug path have the - * flag IORESOURCE_SYSTEM_RAM. If the resource does not have - * this flag, we know that we are dealing with a resource coming - * from HMM/devm. HMM/devm use another mechanism to add/release - * a resource. This goes via devm_request_mem_region and - * devm_release_mem_region. - * HMM/devm take care to release their resources when they want, - * so if we are dealing with them, let us just back off here. - */ - if (!(res->flags & IORESOURCE_SYSRAM)) { - break; - } - if (!(res->flags & IORESOURCE_MEM)) break;