From: Thomas Hellström (VMware)
To: dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, linux-graphics-maintainer@vmware.com
Cc: Ralph Campbell, Michal Hocko, Thomas Hellström,
    "Matthew Wilcox (Oracle)", Jérôme Glisse, Andrew Morton,
    Christian König, "Kirill A. Shutemov"
Subject: [RFC PATCH 0/7] Huge page-table entries for TTM
Date: Wed, 27 Nov 2019 09:31:13 +0100
Message-Id: <20191127083120.34611-1-thomas_os@shipmail.org>

In order to save TLB space and CPU usage, this patchset enables huge and
giant page-table entries for TTM and TTM-enabled graphics drivers.

Patch 1 introduces a vma_is_special_huge() function to make the mm code
take the same path as DAX when splitting huge and giant page-table
entries (that is, zapping the page-table entry and relying on
re-faulting).

Patch 2 makes the mm code split existing huge page-table entries on
huge_fault fallbacks, typically on COW or on buffer objects that want
write-notify. COW and write-notification are always done on the lowest
page-table level. See the patch log message for additional
considerations.

Patch 3 introduces functions to allow the graphics drivers to manipulate
the caching and encryption flags of huge page-table entries without ugly
hacks.

Patch 4 implements the huge_fault handler in TTM. This enables huge
page-table entries, provided that the kernel is configured to support
transhuge pages, either by default or using madvise(). However, they are
unlikely to be inserted unless the kernel buffer object pfns and
user-space addresses align perfectly. There are various options here,
but since buffer objects that reside in system pages typically start at
huge page boundaries if they are backed by huge pages, we try to enforce
buffer object starting pfns and user-space addresses to be huge
page-size aligned if their size exceeds a huge page-size. If pud-size
transhuge ("giant") pages are enabled by the arch, the same holds for
those.

Patch 5 implements a drm helper to align user-space addresses according
to the above scheme, if possible.

Patch 6 implements a TTM range manager that does the same for graphics
IO memory.

Patch 7 finally hooks up the helpers of patches 5 and 6 to the vmwgfx
driver. A similar change is needed for graphics drivers that want a
reasonable likelihood of actually using huge page-table entries.
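
The alignment rule described for patches 4 to 6 can be illustrated with a
small user-space sketch. This is not code from the series: the names
sketch_pick_huge_size() and sketch_align_va(), the SKETCH_* constants, and
the 2 MiB / 1 GiB sizes (the usual x86-64 PMD and PUD sizes) are all
assumptions made for illustration. The sketch also assumes that a buffer
object of exactly one huge page already qualifies, and it ignores the
search for a free range that a real get_unmapped_area-style helper would
have to perform.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for illustration only (typical x86-64 values). */
#define SKETCH_PMD_SIZE (1ULL << 21) /* 2 MiB "huge" entry  */
#define SKETCH_PUD_SIZE (1ULL << 30) /* 1 GiB "giant" entry */

/*
 * Hypothetical helper: pick the largest entry size the buffer object is
 * big enough for, or 0 if only normal pages make sense.
 * Assumption: a size equal to the huge page-size qualifies.
 */
static uint64_t sketch_pick_huge_size(uint64_t bo_size)
{
	if (bo_size >= SKETCH_PUD_SIZE)
		return SKETCH_PUD_SIZE;
	if (bo_size >= SKETCH_PMD_SIZE)
		return SKETCH_PMD_SIZE;
	return 0;
}

/*
 * Hypothetical helper: return the lowest address >= lowest that is
 * congruent to offset modulo huge_size. offset is the byte offset of the
 * mapping inside a buffer object that itself starts on a huge_size
 * boundary, so a congruent virtual address lets the pfns and the
 * user-space addresses line up for huge page-table entries.
 * huge_size must be a power of two.
 */
static uint64_t sketch_align_va(uint64_t lowest, uint64_t offset,
				uint64_t huge_size)
{
	uint64_t mask = huge_size - 1;
	uint64_t va = (lowest & ~mask) | (offset & mask);

	/* Rounded below the requested floor: move up by one huge page. */
	if (va < lowest)
		va += huge_size;
	return va;
}
```

With these assumptions, a mapping at offset 0 of a 2 MiB-aligned buffer
object and a lowest acceptable address of 0x7f0000123000 would be placed
at 0x7f0000200000, at the cost of skipping some address space below it.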

Finally, if a buffer object size is not huge-page or giant-page aligned,
its size will NOT be inflated by this patchset. This means that the
buffer object tail will use smaller page-table entries and thus no
memory overhead occurs. Drivers that want to pay the memory overhead
price need to implement their own scheme to inflate buffer-object sizes.

PMD size huge page-table entries have been tested with vmwgfx and found
to work well with both system memory backed and IO memory backed buffer
objects.

PUD size giant page-table entries have seen limited (fault and COW)
testing using a modified kernel and a fake vmwgfx TTM memory type. The
vmwgfx driver does not otherwise support 1GB-size IO memory resources.

Comments and suggestions welcome.

Thomas

Cc: Andrew Morton
Cc: Michal Hocko
Cc: "Matthew Wilcox (Oracle)"
Cc: "Kirill A. Shutemov"
Cc: Ralph Campbell
Cc: "Jérôme Glisse"
Cc: "Christian König"