From patchwork Sat Feb 18 00:28:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 13145412 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9128DC05027 for ; Sat, 18 Feb 2023 00:30:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3E3E26B0082; Fri, 17 Feb 2023 19:29:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3BB6F6B0083; Fri, 17 Feb 2023 19:29:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 236EF6B0085; Fri, 17 Feb 2023 19:29:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 12CA96B0082 for ; Fri, 17 Feb 2023 19:29:50 -0500 (EST) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id E7F1880723 for ; Sat, 18 Feb 2023 00:29:49 +0000 (UTC) X-FDA: 80478529698.05.0D2B9CE Received: from mail-yb1-f201.google.com (mail-yb1-f201.google.com [209.85.219.201]) by imf08.hostedemail.com (Postfix) with ESMTP id 12926160009 for ; Sat, 18 Feb 2023 00:29:47 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=Ycfh2IW+; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf08.hostedemail.com: domain of 3-xvwYwoKCBo9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3-xvwYwoKCBo9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1676680188; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nWGrggQ/qDTMgdBpDhPDYEEJzTc7x7HUNebehORCAiA=; b=W2F8YCaPXhSTp59FoepV7Fa7V4wy5JX3z25ECwIC4HKHwRTYfNKSJjQmlitXM9A56bI7ef 6YACWjzLMM3ibVxOyIP9fhWAtAxi6qmjg5mbOSFhx3G+rDf1yGXDg9FDt8kTdiiVo7Iagp rfRmgvQ9ePU4piISCTks5ZbXTRc+xJo= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=Ycfh2IW+; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf08.hostedemail.com: domain of 3-xvwYwoKCBo9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com designates 209.85.219.201 as permitted sender) smtp.mailfrom=3-xvwYwoKCBo9J7EK67JED6EE6B4.2ECB8DKN-CCAL02A.EH6@flex--jthoughton.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1676680188; a=rsa-sha256; cv=none; b=LFa6HzI0QHk9ElYpSYKMJhPaYgV4qS93BN8DLjysKGLKlzne9e902ITmLWfDtzSW0YAWy9 iW+a2+PDZVBlppEJbH0fcwFOu3ghR9xJolbPjVQOQyiMP+SOL3ytt30cV1GWKxXmhAgTb7 N/6D30kzyqQT4DX/ZROpvgT54LU7UDM= Received: by mail-yb1-f201.google.com with SMTP id y33-20020a25ad21000000b00953ffdfbe1aso2198014ybi.23 for ; Fri, 17 Feb 2023 16:29:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=nWGrggQ/qDTMgdBpDhPDYEEJzTc7x7HUNebehORCAiA=; b=Ycfh2IW+hDbVlFvLyA3lntlwjCloreR3r/qvCLXWGq6CNK6K2k2C8p8Spq55zPyhuq AAzGAWCFdAKep93lG+yL4ZIn4/dB4F5SfgWiuIci0gRz5Jc/WSpHykVxXgtPRca51cTh Kjt4vcYHHiUBWKNDbEdNkw5LmtxPtJAs961xziLXEQQmcRRKC+XTvb8X3Mv5HuNd3g65 rbEnCtYwG9f2sqWhaLJitGD3eobDHYJm3Lp8idB4WHCamsENmNmOfrndZ99VziA1718v ZFwhg4eMsDrzkJAlatdDRFvDNHgbLy+EDTZnB6N9dZ7yt4JVcAMrYLFaHtXLj4rhKRWg o1Pg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=nWGrggQ/qDTMgdBpDhPDYEEJzTc7x7HUNebehORCAiA=; b=kCABBzD/tvUcNg2qO8UduVHi0ql09U3nQB429oMQCmfdRhqqv/AEdn709UcPJNFFci MRDhzDcz+IRvub3mn20xMJQzq4tGe532EiEuRrXIN/1u4iXEizgB+BGtr3A9mvtdPWl/ nqvWlsHSN2m8RIKJJLe00Px3PLm5U0/ebOvM5F+d7Sd8I1WBrXzd5LhOYWkv3EWVkL+a NHvfc5fImmiPYd2YmtRBh/pmp2MapTpC7uhUDHbxt+VGz2CI64Gy2nvvkld8U50DoOTw 7N/FzQSbsKgbStYvrdRMIIzAR8CD1VnuYVNqwenQZ2HgbfpHS8GplG4GZn/dAvLhuKj1 qJhA== X-Gm-Message-State: AO0yUKVOQTyIjIIlZmqGXtvQkxMcX54bg1oH0qUbapFhf9PuhKfLUmU9 Y5CmojajGFHxa6vosSEpa/IfIJ9BZmAiS5Wr X-Google-Smtp-Source: AK7set8/WW6OVTCGwPxg+uQR5Lu/p2b0h5x6j4EcxX6yyi99urCYtaxSW3TCY7nRxHf6oG5i6KVmizuXRuCPg5hr X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a5b:5c3:0:b0:8ed:262d:defe with SMTP id w3-20020a5b05c3000000b008ed262ddefemr185750ybp.0.1676680187212; Fri, 17 Feb 2023 16:29:47 -0800 (PST) Date: Sat, 18 Feb 2023 00:28:19 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230218002819.1486479-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218002819.1486479-47-jthoughton@google.com> Subject: [PATCH v2 46/46] selftests/mm: add HGM UFFDIO_CONTINUE and hwpoison tests From: James Houghton To: Mike Kravetz , Muchun Song , Peter Xu , Andrew Morton Cc: David Hildenbrand , David Rientjes , Axel Rasmussen , Mina Almasry , "Zach O'Keefe" , Manish Mishra , Naoya Horiguchi , "Dr . David Alan Gilbert" , "Matthew Wilcox (Oracle)" , Vlastimil Babka , Baolin Wang , Miaohe Lin , Yang Shi , Frank van der Linden , Jiaqi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Rspam-User: X-Rspamd-Server: rspam02 X-Rspamd-Queue-Id: 12926160009 X-Stat-Signature: urojamfzjayyueouie7eudotesp4kjhr X-HE-Tag: 1676680187-247339 X-HE-Meta: U2FsdGVkX1+cP81901S1DTQdNd/tYsXAhHvGqfhqmFtRrNezVZOMm4CsRRJAxntN8204Rzm+2EBxjeeeGZloejNy4W2nRkEo6srGF+GdFKl9EWa0fKq+rGt72WpvhuyOr0MYxUzq8OGKKd/7PmW5jMEfsIeuwIcG7D3mdWLysOThu+7D2UKhvX7KlfUyfl6wE28FA7swqd9GJCJXSVqI1f9nx0i0Kr7YaVQWATMJIA1myiohoyhWU3i7ClSlZQ7hMVJXg9RQTxOD8e+PipHNlNBrgr4FGRC7iE9zUfMNKTZNOJ3ty0Ljo+0witCIymByeU3qN4EYRCL/l6ZKtRk9uIm6jaMP4sTfQAY8GaN34i1lUj45rDmXcDBjDg/lDmMo/H3CDR6MGxTVtpz9t/cuFio34pBjDtpg4UIo0gOGRIB/RtPYiwH8eOEAQ2QiDQpAuyRuLBGR3Wy4GZMMA65DLvtSyuGexa6BsHPk5ENpEEgvvlvv2ss8usrgUA+yYpAMjP2cRU/Qs2exs4fCCrDusMlkllH1o9vXr89GoIm7oWkC8oWAO4b3HqqsA2Bq226fZWVlCu3JuyY16v0fOZHAF8t7fwCle4tPiT5vTPKQf+akXbmwv4QSCgVeIYhXzWyr9x7DDYKK4rnv/fzFY+W/1+dYp1SHXjdiBWAdpBDXXkgEuByV/Z1EWblPWjQSz5jA0mgcTzAOFDDizR2I9N0J+59k+Rgy0qv9XtT/bgiNd1AlxC70uA1f7n6VAPImysEiYmsMh5DXnXPVJFrogEZWCs7YYRRK3pJqtKoCW/S30KWWC4BbrUZwQvjqwXGXQbNHHDb4nkgT1AvqCge+KDlD3mZUNockGH4NFT9bsLoZHyarAa6K0sfmHOdQMnEpXKw5Wk2PK1v6uPO2dEU32duhe1IciyE9IYjaRH4F7rlw5akQUUXpgqvicmPzARtp4LDtb+YNQxvI5i0aHC7EaQH SMcxUame Bc2zC6KLt1LqWz1qzPQqKIBekU7ifIpIJRhiVurs0xqn9xXpLu0q0Ov20SX7dOMnPgMxIu/WaXWdu0s1qv87thtwFJe2+UclrgVkV6UXt6s4lqjzApagC06nss+oYcePrDeSTGmHEvq97+1LxJ9QzMAEYu/Xj2LwyYnBSyw47b5nQj6bqTA+cWA4PBjZgdUcs3nSu3hPZOiNzSO8plDBfajuyrSDpDRUrASi8CyQr8F4Jg0P6xON9hLovi/524w3GUwyG3g8EOeqFsC0t9EhxOfoHcHLypetTZFYM4iHuX6xowSQBCF95thtvc6EgjdYVvmAEH+QcWxwF/UrBQu9jqMNBSLQ4RUEZRIfbdlA3zTVzjOJ4ei5Cqjk4KPISCLvyEI0AgbJOc2R1iFjoniIL3WxJ4q6Ael1WJzzLTFE0oLbwoHC2F/5PjY74pzZ9FHrjnogZ6iUTHwqSojLCcJKAxJitC+YDpCghfCzyIzWQ0TwpbyMN32EyVR1BnziiH9bZJhaPBkbP+6MOxB3Anw8mNOjXhGoqs6QJiDjrKtSp7veDlIZYIcZrHsfGd+r1ro8irnfU6FSqKbXprgDDrwr2qPmRWDgKSllwgEfVA3tAlhkHxHw= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Test that high-granularity CONTINUEs at all sizes work (exercising contiguous PTE sizes for arm64, when support is added). Also test that collapse works and hwpoison works correctly (although we aren't yet testing high-granularity poison). This test uses UFFD_FEATURE_EVENT_FORK + UFFD_REGISTER_MODE_WP to force the kernel to copy page tables on fork(), exercising the changes to copy_hugetlb_page_range(). Also test that UFFDIO_WRITEPROTECT doesn't prevent UFFDIO_CONTINUE from behaving properly (in other words, that HGM walks treat UFFD-WP markers like blank PTEs in the appropriate cases). We also test that the uffd-wp PTE markers are preserved properly. Signed-off-by: James Houghton create mode 100644 tools/testing/selftests/mm/hugetlb-hgm.c diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index d90cdc06aa59..920baccccb9e 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -36,6 +36,7 @@ TEST_GEN_FILES += compaction_test TEST_GEN_FILES += gup_test TEST_GEN_FILES += hmm-tests TEST_GEN_FILES += hugetlb-madvise +TEST_GEN_FILES += hugetlb-hgm TEST_GEN_FILES += hugepage-mmap TEST_GEN_FILES += hugepage-mremap TEST_GEN_FILES += hugepage-shm diff --git a/tools/testing/selftests/mm/hugetlb-hgm.c b/tools/testing/selftests/mm/hugetlb-hgm.c new file mode 100644 index 000000000000..4c27a6a11818 --- /dev/null +++ b/tools/testing/selftests/mm/hugetlb-hgm.c @@ -0,0 +1,608 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Test uncommon cases in HugeTLB high-granularity mapping: + * 1. Test all supported high-granularity page sizes (with MADV_COLLAPSE). + * 2. Test MADV_HWPOISON behavior. + * 3. Test interaction with UFFDIO_WRITEPROTECT. + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define PAGE_SIZE 4096 +#define PAGE_MASK ~(PAGE_SIZE - 1) + +#ifndef MADV_COLLAPSE +#define MADV_COLLAPSE 25 +#endif + +#ifndef MADV_SPLIT +#define MADV_SPLIT 26 +#endif + +#define PREFIX " ... " +#define ERROR_PREFIX " !!! " + +static void *sigbus_addr; +bool was_mceerr; +bool got_sigbus; +bool expecting_sigbus; + +enum test_status { + TEST_PASSED = 0, + TEST_FAILED = 1, + TEST_SKIPPED = 2, +}; + +static char *status_to_str(enum test_status status) +{ + switch (status) { + case TEST_PASSED: + return "TEST_PASSED"; + case TEST_FAILED: + return "TEST_FAILED"; + case TEST_SKIPPED: + return "TEST_SKIPPED"; + default: + return "TEST_???"; + } +} + +static int userfaultfd(int flags) +{ + return syscall(__NR_userfaultfd, flags); +} + +static int map_range(int uffd, char *addr, uint64_t length) +{ + struct uffdio_continue cont = { + .range = (struct uffdio_range) { + .start = (uint64_t)addr, + .len = length, + }, + .mode = 0, + .mapped = 0, + }; + + if (ioctl(uffd, UFFDIO_CONTINUE, &cont) < 0) { + perror(ERROR_PREFIX "UFFDIO_CONTINUE failed"); + return -1; + } + return 0; +} + +static int userfaultfd_writeprotect(int uffd, char *addr, uint64_t length, + bool protect) +{ + struct uffdio_writeprotect wp = { + .range = (struct uffdio_range) { + .start = (uint64_t)addr, + .len = length, + }, + .mode = UFFDIO_WRITEPROTECT_MODE_DONTWAKE, + }; + + if (protect) + wp.mode = UFFDIO_WRITEPROTECT_MODE_WP; + + printf(PREFIX "UFFDIO_WRITEPROTECT: %p -> %p (%sprotected)\n", addr, + addr + length, protect ? "" : "un"); + + if (ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) < 0) { + perror(ERROR_PREFIX "UFFDIO_WRITEPROTECT failed"); + return -1; + } + return 0; +} + +static int check_equal(char *mapping, size_t length, char value) +{ + size_t i; + + for (i = 0; i < length; ++i) + if (mapping[i] != value) { + printf(ERROR_PREFIX "mismatch at %p (%d != %d)\n", + &mapping[i], mapping[i], value); + return -1; + } + + return 0; +} + +static int test_continues(int uffd, char *primary_map, char *secondary_map, + size_t len, bool verify) +{ + size_t offset = 0; + unsigned char iter = 0; + unsigned long pagesize = getpagesize(); + uint64_t size; + + for (size = len/2; size >= pagesize; + offset += size, size /= 2) { + iter++; + memset(secondary_map + offset, iter, size); + printf(PREFIX "UFFDIO_CONTINUE: %p -> %p = %d%s\n", + primary_map + offset, + primary_map + offset + size, + iter, + verify ? " (and verify)" : ""); + if (map_range(uffd, primary_map + offset, size)) + return -1; + if (verify && check_equal(primary_map + offset, size, iter)) + return -1; + } + return 0; +} + +static int verify_contents(char *map, size_t len, bool last_page_zero) +{ + size_t offset = 0; + int i = 0; + uint64_t size; + + for (size = len/2; size > PAGE_SIZE; offset += size, size /= 2) + if (check_equal(map + offset, size, ++i)) + return -1; + + if (last_page_zero) + if (check_equal(map + len - PAGE_SIZE, PAGE_SIZE, 0)) + return -1; + + return 0; +} + +static int test_collapse(char *primary_map, size_t len, bool verify) +{ + int ret = 0; + + printf(PREFIX "collapsing %p -> %p\n", primary_map, primary_map + len); + if (madvise(primary_map, len, MADV_COLLAPSE) < 0) { + perror(ERROR_PREFIX "collapse failed"); + return -1; + } + + if (verify) { + printf(PREFIX "verifying %p -> %p\n", primary_map, + primary_map + len); + ret = verify_contents(primary_map, len, true); + } + return ret; +} + +static void sigbus_handler(int signo, siginfo_t *info, void *context) +{ + if (!expecting_sigbus) + printf(ERROR_PREFIX "unexpected sigbus: %p\n", info->si_addr); + + got_sigbus = true; + was_mceerr = info->si_code == BUS_MCEERR_AR; + sigbus_addr = info->si_addr; + + pthread_exit(NULL); +} + +static void *access_mem(void *addr) +{ + volatile char *ptr = addr; + + /* + * Do a write without changing memory contents, as other routines will + * need to verify that mapping contents haven't changed. + * + * We do a write so that we trigger uffd-wp SIGBUSes. To test that we + * get HWPOISON SIGBUSes, we would only need to read. + */ + *ptr = *ptr; + return NULL; +} + +static int test_sigbus(char *addr, bool poison) +{ + int ret; + pthread_t pthread; + + sigbus_addr = (void *)0xBADBADBAD; + was_mceerr = false; + got_sigbus = false; + expecting_sigbus = true; + ret = pthread_create(&pthread, NULL, &access_mem, addr); + if (ret) { + printf(ERROR_PREFIX "failed to create thread: %s\n", + strerror(ret)); + goto out; + } + + pthread_join(pthread, NULL); + + ret = -1; + if (!got_sigbus) + printf(ERROR_PREFIX "didn't get a SIGBUS: %p\n", addr); + else if (sigbus_addr != addr) + printf(ERROR_PREFIX "got incorrect sigbus address: %p vs %p\n", + sigbus_addr, addr); + else if (poison && !was_mceerr) + printf(ERROR_PREFIX "didn't get an MCEERR?\n"); + else + ret = 0; +out: + expecting_sigbus = false; + return ret; +} + +static void *read_from_uffd_thd(void *arg) +{ + int uffd = *(int *)arg; + struct uffd_msg msg; + /* opened without O_NONBLOCK */ + if (read(uffd, &msg, sizeof(msg)) != sizeof(msg)) + printf(ERROR_PREFIX "reading uffd failed\n"); + + return NULL; +} + +static int read_event_from_uffd(int *uffd, pthread_t *pthread) +{ + int ret = 0; + + ret = pthread_create(pthread, NULL, &read_from_uffd_thd, (void *)uffd); + if (ret) { + printf(ERROR_PREFIX "failed to create thread: %s\n", + strerror(ret)); + return ret; + } + return 0; +} + +static int test_sigbus_range(char *primary_map, size_t len, bool hwpoison) +{ + const unsigned long pagesize = getpagesize(); + const int num_checks = 512; + unsigned long bytes_per_check = len/num_checks; + int i; + + printf(PREFIX "checking that we can't access " + "(%d addresses within %p -> %p)\n", + num_checks, primary_map, primary_map + len); + + if (pagesize > bytes_per_check) + bytes_per_check = pagesize; + + for (i = 0; i < len; i += bytes_per_check) + if (test_sigbus(primary_map + i, hwpoison) < 0) + return 1; + /* check very last byte, because we left it unmapped */ + if (test_sigbus(primary_map + len - 1, hwpoison)) + return 1; + + return 0; +} + +static enum test_status test_hwpoison(char *primary_map, size_t len) +{ + printf(PREFIX "poisoning %p -> %p\n", primary_map, primary_map + len); + if (madvise(primary_map, len, MADV_HWPOISON) < 0) { + perror(ERROR_PREFIX "MADV_HWPOISON failed"); + return TEST_SKIPPED; + } + + return test_sigbus_range(primary_map, len, true) + ? TEST_FAILED : TEST_PASSED; +} + +static int test_fork(int uffd, char *primary_map, size_t len) +{ + int status; + int ret = 0; + pid_t pid; + pthread_t uffd_thd; + + /* + * UFFD_FEATURE_EVENT_FORK will put fork event on the userfaultfd, + * which we must read, otherwise we block fork(). Setup a thread to + * read that event now. + * + * Page fault events should result in a SIGBUS, so we expect only a + * single event from the uffd (the fork event). + */ + if (read_event_from_uffd(&uffd, &uffd_thd)) + return -1; + + pid = fork(); + + if (!pid) { + /* + * Because we have UFFDIO_REGISTER_MODE_WP and + * UFFD_FEATURE_EVENT_FORK, the page tables should be copied + * exactly. + * + * Check that everything except that last 4K has correct + * contents, and then check that the last 4K gets a SIGBUS. + */ + printf(PREFIX "child validating...\n"); + ret = verify_contents(primary_map, len, false) || + test_sigbus(primary_map + len - 1, false); + ret = 0; + exit(ret ? 1 : 0); + } else { + /* wait for the child to finish. */ + waitpid(pid, &status, 0); + ret = WEXITSTATUS(status); + if (!ret) { + printf(PREFIX "parent validating...\n"); + /* Same check as the child. */ + ret = verify_contents(primary_map, len, false) || + test_sigbus(primary_map + len - 1, false); + ret = 0; + } + } + + pthread_join(uffd_thd, NULL); + return ret; + +} + +static int uffd_register(int uffd, char *primary_map, unsigned long len, + int mode) +{ + struct uffdio_register reg; + + reg.range.start = (unsigned long)primary_map; + reg.range.len = len; + reg.mode = mode; + + reg.ioctls = 0; + return ioctl(uffd, UFFDIO_REGISTER, ®); +} + +enum test_type { + TEST_DEFAULT, + TEST_UFFDWP, + TEST_HWPOISON +}; + +static enum test_status +test_hgm(int fd, size_t hugepagesize, size_t len, enum test_type type) +{ + int uffd; + char *primary_map, *secondary_map; + struct uffdio_api api; + struct sigaction new, old; + enum test_status status = TEST_SKIPPED; + bool hwpoison = type == TEST_HWPOISON; + bool uffd_wp = type == TEST_UFFDWP; + bool verify = type == TEST_DEFAULT; + int register_args; + + if (ftruncate(fd, len) < 0) { + perror(ERROR_PREFIX "ftruncate failed"); + return status; + } + + uffd = userfaultfd(O_CLOEXEC); + if (uffd < 0) { + perror(ERROR_PREFIX "uffd not created"); + return status; + } + + primary_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (primary_map == MAP_FAILED) { + perror(ERROR_PREFIX "mmap for primary mapping failed"); + goto close_uffd; + } + secondary_map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + if (secondary_map == MAP_FAILED) { + perror(ERROR_PREFIX "mmap for secondary mapping failed"); + goto unmap_primary; + } + + printf(PREFIX "primary mapping: %p\n", primary_map); + printf(PREFIX "secondary mapping: %p\n", secondary_map); + + api.api = UFFD_API; + api.features = UFFD_FEATURE_SIGBUS | UFFD_FEATURE_EXACT_ADDRESS | + UFFD_FEATURE_EVENT_FORK; + if (ioctl(uffd, UFFDIO_API, &api) == -1) { + perror(ERROR_PREFIX "UFFDIO_API failed"); + goto out; + } + + if (madvise(primary_map, len, MADV_SPLIT)) { + perror(ERROR_PREFIX "MADV_SPLIT failed"); + goto out; + } + + /* + * Register with UFFDIO_REGISTER_MODE_WP to force fork() to copy page + * tables (also need UFFD_FEATURE_EVENT_FORK, which we have). + */ + register_args = UFFDIO_REGISTER_MODE_MISSING | UFFDIO_REGISTER_MODE_WP; + if (!uffd_wp) + /* + * If we're testing UFFDIO_WRITEPROTECT, then we don't want + * minor faults. With minor faults enabled, we'll get SIGBUSes + * for any minor fault, wheresa without minot faults enabled, + * writes will verify that uffd-wp PTE markers were installed + * properly. + */ + register_args |= UFFDIO_REGISTER_MODE_MINOR; + + if (uffd_register(uffd, primary_map, len, register_args)) { + perror(ERROR_PREFIX "UFFDIO_REGISTER failed"); + goto out; + } + + + new.sa_sigaction = &sigbus_handler; + new.sa_flags = SA_SIGINFO; + if (sigaction(SIGBUS, &new, &old) < 0) { + perror(ERROR_PREFIX "could not setup SIGBUS handler"); + goto out; + } + + status = TEST_FAILED; + + if (uffd_wp) { + /* + * Install uffd-wp PTE markers now. They should be preserved + * as we split the mappings with UFFDIO_CONTINUE later. + */ + if (userfaultfd_writeprotect(uffd, primary_map, len, true)) + goto done; + /* Verify that we really are write-protected. */ + if (test_sigbus(primary_map, false)) + goto done; + } + + /* + * Main piece of the test: map primary_map at all the possible + * page sizes. Starting at the hugepage size and going down to + * PAGE_SIZE. This leaves the final PAGE_SIZE piece of the mapping + * unmapped. + */ + if (test_continues(uffd, primary_map, secondary_map, len, verify)) + goto done; + + /* + * Verify that MADV_HWPOISON is able to properly poison the entire + * mapping. + */ + if (hwpoison) { + enum test_status new_status = test_hwpoison(primary_map, len); + + if (new_status != TEST_PASSED) { + status = new_status; + goto done; + } + } + + if (uffd_wp) { + /* + * Check that the uffd-wp marker we installed initially still + * exists in the unmapped 4K piece at the end the mapping. + * + * test_sigbus() will do a write. When this happens: + * 1. The page fault handler will find the uffd-wp marker and + * create a read-only PTE. + * 2. The memory access is retried, and the page fault handler + * will find that a write was attempted in a UFFD_WP VMA + * where a RO mapping exists, so SIGBUS + * (we have UFFD_FEATURE_SIGBUS). + * + * We only check the final pag because UFFDIO_CONTINUE will + * have cleared the write-protection on all the other pieces + * of the mapping. + */ + printf(PREFIX "verifying that we can't write to final page\n"); + if (test_sigbus(primary_map + len - 1, false)) + goto done; + } + + if (!hwpoison) + /* + * test_fork() will verify memory contents. We can't do + * that if memory has been poisoned. + */ + if (test_fork(uffd, primary_map, len)) + goto done; + + /* + * Check that MADV_COLLAPSE functions properly. That is: + * - the PAGE_SIZE hole we had is no longer unmapped. + * - poisoned regions are still poisoned. + * + * Verify the data is correct if we haven't poisoned. + */ + if (test_collapse(primary_map, len, !hwpoison)) + goto done; + /* + * Verify that memory is still poisoned. + */ + if (hwpoison && test_sigbus_range(primary_map, len, true)) + goto done; + + status = TEST_PASSED; + +done: + if (ftruncate(fd, 0) < 0) { + perror(ERROR_PREFIX "ftruncate back to 0 failed"); + status = TEST_FAILED; + } + +out: + munmap(secondary_map, len); +unmap_primary: + munmap(primary_map, len); +close_uffd: + close(uffd); + return status; +} + +int main(void) +{ + int fd; + struct statfs file_stat; + size_t hugepagesize; + size_t len; + enum test_status status; + int ret = 0; + + fd = memfd_create("hugetlb_tmp", MFD_HUGETLB); + if (fd < 0) { + perror(ERROR_PREFIX "could not open hugetlbfs file"); + return -1; + } + + memset(&file_stat, 0, sizeof(file_stat)); + if (fstatfs(fd, &file_stat)) { + perror(ERROR_PREFIX "fstatfs failed"); + goto close; + } + if (file_stat.f_type != HUGETLBFS_MAGIC) { + printf(ERROR_PREFIX "not hugetlbfs file\n"); + goto close; + } + + hugepagesize = file_stat.f_bsize; + len = 2 * hugepagesize; + + printf("HGM regular test...\n"); + status = test_hgm(fd, hugepagesize, len, TEST_DEFAULT); + printf("HGM regular test: %s\n", status_to_str(status)); + if (status == TEST_FAILED) + ret = -1; + + printf("HGM uffd-wp test...\n"); + status = test_hgm(fd, hugepagesize, len, TEST_UFFDWP); + printf("HGM uffd-wp test: %s\n", status_to_str(status)); + if (status == TEST_FAILED) + ret = -1; + + printf("HGM hwpoison test...\n"); + status = test_hgm(fd, hugepagesize, len, TEST_HWPOISON); + printf("HGM hwpoison test: %s\n", status_to_str(status)); + if (status == TEST_FAILED) + ret = -1; +close: + close(fd); + + return ret; +}