From patchwork Mon Feb 6 07:47:38 2023
X-Patchwork-Submitter: Alistair Popple
X-Patchwork-Id: 13129430
From: Alistair Popple
To: linux-mm@kvack.org, cgroups@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, jgg@nvidia.com, jhubbard@nvidia.com,
 tjmercier@google.com, hannes@cmpxchg.org, surenb@google.com,
 mkoutny@suse.com, daniel@ffwll.ch, "Daniel P . Berrange",
 Alex Williamson, Alistair Popple, linuxppc-dev@lists.ozlabs.org,
 linux-fpga@vger.kernel.org, linux-rdma@vger.kernel.org,
 virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
 netdev@vger.kernel.org, io-uring@vger.kernel.org, bpf@vger.kernel.org,
 rds-devel@oss.oracle.com, linux-kselftest@vger.kernel.org
Subject: [PATCH 01/19] mm: Introduce vm_account
Date: Mon, 6 Feb 2023 18:47:38 +1100
X-Mailer: git-send-email 2.39.0

Kernel drivers that pin pages should account these pages against
user->locked_vm and/or mm->pinned_vm, and fail the pinning if
RLIMIT_MEMLOCK is exceeded and CAP_IPC_LOCK isn't held.

Currently drivers open-code this accounting and use various methods to
update the atomic variables and check against the limits, leading to
various bugs and inconsistencies. To fix this, introduce a standard
interface for charging pinned and locked memory. As this involves
taking references on kernel objects such as mm_struct or user_struct,
we introduce a new vm_account struct to hold these references. Several
helper functions are then introduced to grab references and check
limits.

As the way these limits are charged and enforced is visible to
userspace, we need to be careful not to break existing applications by
charging to different counters. As a result, the vm_account functions
support accounting to different counters as required.

A future change will extend this to also account against a cgroup for
pinned pages.

Signed-off-by: Alistair Popple
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-fpga@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: kvm@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: cgroups@vger.kernel.org
Cc: io-uring@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: bpf@vger.kernel.org
Cc: rds-devel@oss.oracle.com
Cc: linux-kselftest@vger.kernel.org
---
 include/linux/vm_account.h |  56 +++++++++++++++++-
 mm/util.c                  | 127 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 183 insertions(+)
 create mode 100644 include/linux/vm_account.h

diff --git a/include/linux/vm_account.h b/include/linux/vm_account.h
new file mode 100644
index 0000000..b4b2e90
--- /dev/null
+++ b/include/linux/vm_account.h
@@ -0,0 +1,56 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_VM_ACCOUNT_H
+#define _LINUX_VM_ACCOUNT_H
+
+/**
+ * enum vm_account_flags - Determine how pinned/locked memory is accounted.
+ * @VM_ACCOUNT_TASK: Account pinned memory to mm->pinned_vm.
+ * @VM_ACCOUNT_BYPASS: Don't enforce rlimit on any charges.
+ * @VM_ACCOUNT_USER: Account locked memory to user->locked_vm.
+ *
+ * Determines which statistic pinned/locked memory is accounted
+ * against. All limits will be enforced against RLIMIT_MEMLOCK and the
+ * pins cgroup if CONFIG_CGROUP_PINS is enabled.
+ *
+ * New drivers should use VM_ACCOUNT_TASK. VM_ACCOUNT_USER is used by
+ * pre-existing drivers to maintain existing accounting against
+ * user->locked_vm rather than mm->pinned_vm.
+ *
+ * VM_ACCOUNT_BYPASS may also be specified to bypass rlimit
+ * checks. Typically this is used to cache CAP_IPC_LOCK from when a
+ * driver is first initialised. Note that this does not bypass cgroup
+ * limit checks.
+ */
+enum vm_account_flags {
+	VM_ACCOUNT_TASK = 0,
+	VM_ACCOUNT_BYPASS = 1,
+	VM_ACCOUNT_USER = 2,
+};
+
+struct vm_account {
+	struct task_struct *task;
+	struct mm_struct *mm;
+	struct user_struct *user;
+	enum vm_account_flags flags;
+};
+
+void vm_account_init(struct vm_account *vm_account, struct task_struct *task,
+		struct user_struct *user, enum vm_account_flags flags);
+
+/**
+ * vm_account_init_current - Initialise a new struct vm_account.
+ * @vm_account: pointer to uninitialised vm_account.
+ *
+ * Helper to initialise a vm_account for the common case of charging
+ * with VM_ACCOUNT_TASK against current.
+ */
+static inline void vm_account_init_current(struct vm_account *vm_account)
+{
+	vm_account_init(vm_account, current, NULL, VM_ACCOUNT_TASK);
+}
+
+void vm_account_release(struct vm_account *vm_account);
+int vm_account_pinned(struct vm_account *vm_account, unsigned long npages);
+void vm_unaccount_pinned(struct vm_account *vm_account, unsigned long npages);
+
+#endif /* _LINUX_VM_ACCOUNT_H */
diff --git a/mm/util.c b/mm/util.c
index b56c92f..d8c19f8 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -23,6 +23,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -431,6 +432,132 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 #endif
 
 /**
+ * vm_account_init - Initialise a new struct vm_account.
+ * @vm_account: pointer to uninitialised vm_account.
+ * @task: task to charge against.
+ * @user: user to charge against. Must be non-NULL for VM_ACCOUNT_USER.
+ * @flags: flags to use when charging to vm_account.
+ *
+ * Initialise a new uninitialised struct vm_account. Takes references
+ * on the task/mm/user/cgroup as required although callers must ensure
+ * any references passed in remain valid for the duration of this
+ * call.
+ */
+void vm_account_init(struct vm_account *vm_account, struct task_struct *task,
+		struct user_struct *user, enum vm_account_flags flags)
+{
+	vm_account->task = get_task_struct(task);
+
+	if (flags & VM_ACCOUNT_USER)
+		vm_account->user = get_uid(user);
+
+	mmgrab(task->mm);
+	vm_account->mm = task->mm;
+	vm_account->flags = flags;
+}
+EXPORT_SYMBOL_GPL(vm_account_init);
+
+/**
+ * vm_account_release - Release the references held by a struct vm_account.
+ * @vm_account: pointer to initialised vm_account.
+ *
+ * Drop any object references obtained by vm_account_init(). The
+ * vm_account must not be used after calling this unless reinitialised
+ * with vm_account_init().
+ */
+void vm_account_release(struct vm_account *vm_account)
+{
+	put_task_struct(vm_account->task);
+	if (vm_account->flags & VM_ACCOUNT_USER)
+		free_uid(vm_account->user);
+
+	mmdrop(vm_account->mm);
+}
+EXPORT_SYMBOL_GPL(vm_account_release);
+
+/*
+ * Charge pages with an atomic compare and swap. Returns -ENOMEM on
+ * failure, 1 on success and 0 for retry.
+ */
+static int vm_account_cmpxchg(struct vm_account *vm_account,
+				unsigned long npages, unsigned long lock_limit)
+{
+	u64 cur_pages, new_pages;
+
+	if (vm_account->flags & VM_ACCOUNT_USER)
+		cur_pages = atomic_long_read(&vm_account->user->locked_vm);
+	else
+		cur_pages = atomic64_read(&vm_account->mm->pinned_vm);
+
+	new_pages = cur_pages + npages;
+	if (lock_limit != RLIM_INFINITY && new_pages > lock_limit)
+		return -ENOMEM;
+
+	if (vm_account->flags & VM_ACCOUNT_USER) {
+		return atomic_long_cmpxchg(&vm_account->user->locked_vm,
+					cur_pages, new_pages) == cur_pages;
+	} else {
+		return atomic64_cmpxchg(&vm_account->mm->pinned_vm,
+					cur_pages, new_pages) == cur_pages;
+	}
+}
+
+/**
+ * vm_account_pinned - Charge pinned or locked memory to the vm_account.
+ * @vm_account: pointer to an initialised vm_account.
+ * @npages: number of pages to charge.
+ *
+ * Return: 0 on success, -ENOMEM if a limit would be exceeded.
+ *
+ * Note: All pages must be explicitly uncharged with
+ * vm_unaccount_pinned() prior to releasing the vm_account with
+ * vm_account_release().
+ */
+int vm_account_pinned(struct vm_account *vm_account, unsigned long npages)
+{
+	unsigned long lock_limit = RLIM_INFINITY;
+	int ret;
+
+	if (!(vm_account->flags & VM_ACCOUNT_BYPASS) && !capable(CAP_IPC_LOCK))
+		lock_limit = task_rlimit(vm_account->task,
+					RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+
+	while (true) {
+		ret = vm_account_cmpxchg(vm_account, npages, lock_limit);
+		if (ret > 0)
+			break;
+		else if (ret < 0)
+			return ret;
+	}
+
+	/*
+	 * Always add pinned pages to mm->pinned_vm even when we're
+	 * not enforcing the limit against that.
+	 */
+	if (vm_account->flags & VM_ACCOUNT_USER)
+		atomic64_add(npages, &vm_account->mm->pinned_vm);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(vm_account_pinned);
+
+/**
+ * vm_unaccount_pinned - Uncharge pinned or locked memory from the vm_account.
+ * @vm_account: pointer to an initialised vm_account.
+ * @npages: number of pages to uncharge.
+ */
+void vm_unaccount_pinned(struct vm_account *vm_account, unsigned long npages)
+{
+	if (vm_account->flags & VM_ACCOUNT_USER) {
+		atomic_long_sub(npages, &vm_account->user->locked_vm);
+		atomic64_sub(npages, &vm_account->mm->pinned_vm);
+	} else {
+		atomic64_sub(npages, &vm_account->mm->pinned_vm);
+	}
+}
+EXPORT_SYMBOL_GPL(vm_unaccount_pinned);
+
+/**
  * __account_locked_vm - account locked pages to an mm's locked_vm
  * @mm: mm to account against
  * @pages: number of pages to account
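
For anyone who wants to experiment with the charging logic outside the
kernel, the following is a minimal userspace sketch of the
compare-and-swap loop used by vm_account_cmpxchg() and
vm_account_pinned() above. It is a model only, not part of the patch:
it assumes C11 <stdatomic.h> in place of the kernel's atomic64/atomic_long
helpers, and the names model_account, model_charge, model_uncharge and
MODEL_NO_LIMIT are hypothetical stand-ins for struct vm_account,
vm_account_pinned(), vm_unaccount_pinned() and RLIM_INFINITY. Reference
counting, capability checks and the BYPASS flag are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
#define MODEL_NO_LIMIT UINT64_MAX	/* plays the role of RLIM_INFINITY */

struct model_account {
	_Atomic uint64_t locked_vm;	/* models user->locked_vm */
	_Atomic uint64_t pinned_vm;	/* models mm->pinned_vm */
	bool charge_user;		/* models the VM_ACCOUNT_USER flag */
};

/* Charge npages to the limited counter; -ENOMEM if the limit would be exceeded. */
static int model_charge(struct model_account *acct, uint64_t npages,
			uint64_t limit)
{
	_Atomic uint64_t *ctr = acct->charge_user ? &acct->locked_vm
						  : &acct->pinned_vm;
	uint64_t cur = atomic_load(ctr);

	do {
		/* Check the limit against the most recently observed value. */
		if (limit != MODEL_NO_LIMIT && cur + npages > limit)
			return -ENOMEM;
		/* A failed exchange reloads cur with the current counter value. */
	} while (!atomic_compare_exchange_weak(ctr, &cur, cur + npages));

	/* As in the patch, user charges are also mirrored into pinned_vm. */
	if (acct->charge_user)
		atomic_fetch_add(&acct->pinned_vm, npages);

	return 0;
}

/* Undo a previous successful model_charge() of the same size. */
static void model_uncharge(struct model_account *acct, uint64_t npages)
{
	if (acct->charge_user) {
		assert(atomic_load(&acct->locked_vm) >= npages);
		atomic_fetch_sub(&acct->locked_vm, npages);
	}
	assert(atomic_load(&acct->pinned_vm) >= npages);
	atomic_fetch_sub(&acct->pinned_vm, npages);
}
```

The limit check sits inside the retry loop so that a concurrent charge
which wins the race cannot push the counter past the limit unnoticed,
which is the property the open-coded driver accounting tended to get
wrong. A caller would pair every successful model_charge() with a
model_uncharge() of the same size, mirroring the requirement that pages
be uncharged before vm_account_release().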