From patchwork Fri Oct 27 18:21:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Sean Christopherson X-Patchwork-Id: 13439020 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F373AC25B72 for ; Fri, 27 Oct 2023 18:24:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:Reply-To:List-Subscribe:List-Help: List-Post:List-Archive:List-Unsubscribe:List-Id:Cc:To:From:Subject:Message-ID :References:Mime-Version:In-Reply-To:Date:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=TbH/8ZTU3GYEZpheSByQL0UpNxxtnKexP1w9sYUWu+o=; b=DEv4erUamFRBj7 +SEmrrTqonXe0CJ95f0urfpT8VA0M8EabmAQkmC9wXY4CLQCwtq+GiE7rJpk6/14rUiSaNHAO1gvJ JR4OTbSe232BZNbXx0nt4dOlfbMgqalWCUX/oTY7R58Y+J44k0GWtovx6mTvXsAB4wdtsPLZQNdEL MvG7LXrSQl6fSwgBAGpDVkoPhChQ7qZNXkd7NAvoc1bE63yJi/fxuEOXi9mwrn9UJq3MSVEis7Phj /kQJyS9WQAcMXVBn/uXhHx4p4OYGVPqGVgAXB1AXnXqMruj34PPWWi+9IWl3+03rcNcRRXKRMFwqU 8O4UR5OEf7nhDFmTUJ7A==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1qwRV0-00GyAC-19; Fri, 27 Oct 2023 18:23:58 +0000 Received: from casper.infradead.org ([2001:8b0:10b:1236::1]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1qwRU1-00GxAL-2w for linux-riscv@bombadil.infradead.org; Fri, 27 Oct 2023 18:22:58 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=Content-Transfer-Encoding:Content-Type: Cc:To:From:Subject:Message-ID:References:Mime-Version:In-Reply-To:Date: Reply-To:Sender:Content-ID:Content-Description; bh=PHuKmbJK++WVHRTr2ixGhZnz3PjglwNcwDhJuJ6xsm0=; b=RZUILr2OoestlskP8laIc2jR4b 72gUwFqVl7snDgkydNQ7ag6eYvANwPe5gU5hBwbaSF4g1btWK4DoBtOI1/R1QBLSRX99o+aog6R0n bWyfhtyBK181isGpxhJpEEnxQNgdULcKrNtJl1Z6M/VqGVNI/Ic8EKA8y4yft/IuTho69l3KxftgL rYE8KrA99qvkGJpYXvH27Ks4jz/6NPD1bgu8t3BOu2xIX0WyublY4iOMU2PW1jaun6jr2llHbzw31 ZR9UF1kyb4bNOX10NjfTumO83lMjl424D5JXq+sngpKBq5tVpvV/wMk0tP2rZ0A92tiE5nlpmCzyj 3JmormIw==; Received: from mail-yw1-x114a.google.com ([2607:f8b0:4864:20::114a]) by casper.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1qwRTx-004sxq-Vp for linux-riscv@lists.infradead.org; Fri, 27 Oct 2023 18:22:56 +0000 Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-5af592fed43so14017417b3.2 for ; Fri, 27 Oct 2023 11:22:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698430971; x=1699035771; darn=lists.infradead.org; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:from:to:cc:subject:date :message-id:reply-to; bh=PHuKmbJK++WVHRTr2ixGhZnz3PjglwNcwDhJuJ6xsm0=; b=D1XaeujpheAVaCXJOApyL9wWeLaNoUqEtf2fx9meAI/l+pu6/8ZngeSRO0s/NddARM 5G+2ILLVDhLhVS+GHyvpt3kqSL201EZUobeAhdylgfqXld5ptlydY8DzgtotHZvnVreb m+ocQAz/pp3ijeCxm2nr3tpBMYkaUE7sVhZTjgbzM1EhnCa4lsoU7ldCgk/8fjd9lwYY Fi5tyLwAJe+eBjFgvYLGtaLuQAI0nSrST6v/wowF1V5QZz0gc4/egL7veGV98CGDxxIK bkeOdiJAoHN24WxhjaxErVp3WJRwQra1Y6WMW6zgrQdznZc3a3T7Vwix51Q2vco268IW RG6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698430971; x=1699035771; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=PHuKmbJK++WVHRTr2ixGhZnz3PjglwNcwDhJuJ6xsm0=; b=JwXbjS6kyqTjyx1MyzyTXSTlW37cyVzU4yZZvG1V5RV9mYBcmif/cevHJCJexJxH2H GCjygLKSGtGmVGBDZHrWvb0ADsraf44fEe29yIz4O07zO+g2EzXr71H25LJIeUmrUJJe FNAxbdTvvRpR6JJEQSd9zQ0AO6ZGp0r9UqLX4PwYX6vnfI2b3hsOVXcPMvMJ7Ih5iwiZ 79wsQT+uV2xVAa5p3ohLDa6F0DorzItp50SdQCzBleBNyWnxEcZ9Zb6nXPoAYEGbyVM2 Jwn5OItrdpolXi/SkS2tkYfMUNeTPaEATV2HcNVAxjZOtFWp566TteqhvqOvKYRx2nd6 igbA== X-Gm-Message-State: AOJu0YwW47qGvlxccQrPLTEqY5wqdhXAJcjBrF2Hh2uiXpXDMByfwI7e pVWvhiwffRNspESOjD1tcZCnmb2ZU84= X-Google-Smtp-Source: AGHT+IHeAi8c5J1JHmkq3P4PpsdNdjX6Wq3XK5NOfAhf9NwJS3eapFdTXpLG2igv2I31VgMM7Gn14gWqO18= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a0d:ca97:0:b0:5a1:d4a5:7dff with SMTP id m145-20020a0dca97000000b005a1d4a57dffmr73743ywd.6.1698430971510; Fri, 27 Oct 2023 11:22:51 -0700 (PDT) Date: Fri, 27 Oct 2023 11:21:55 -0700 In-Reply-To: <20231027182217.3615211-1-seanjc@google.com> Mime-Version: 1.0 References: <20231027182217.3615211-1-seanjc@google.com> X-Mailer: git-send-email 2.42.0.820.g83a721a137-goog Message-ID: <20231027182217.3615211-14-seanjc@google.com> Subject: [PATCH v13 13/35] KVM: Introduce per-page memory attributes From: Sean Christopherson To: Paolo Bonzini , Marc Zyngier , Oliver Upton , Huacai Chen , Michael Ellerman , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , Alexander Viro , Christian Brauner , "Matthew Wilcox (Oracle)" , Andrew Morton Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li , Xu Yilun , Chao Peng , Fuad Tabba , Jarkko Sakkinen , Anish Moorthy , David Matlack , Yu Zhang , Isaku Yamahata , " =?utf-8?q?Micka=C3=ABl_Sala?= =?utf-8?q?=C3=BCn?= " , Vlastimil Babka , Vishal Annapurve , Ackerley Tng , Maciej Szmigiero , David Hildenbrand , Quentin Perret , Michael Roth , Wang , Liam Merwick , Isaku Yamahata , "Kirill A . Shutemov" X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231027_192254_071013_0736C360 X-CRM114-Status: GOOD ( 24.86 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Sean Christopherson Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org From: Chao Peng In confidential computing usages, whether a page is private or shared is necessary information for KVM to perform operations like page fault handling, page zapping etc. There are other potential use cases for per-page memory attributes, e.g. to make memory read-only (or no-exec, or exec-only, etc.) without having to modify memslots. Introduce two ioctls (advertised by KVM_CAP_MEMORY_ATTRIBUTES) to allow userspace to operate on the per-page memory attributes. - KVM_SET_MEMORY_ATTRIBUTES to set the per-page memory attributes to a guest memory range. - KVM_GET_SUPPORTED_MEMORY_ATTRIBUTES to return the KVM supported memory attributes. Use an xarray to store the per-page attributes internally, with a naive, not fully optimized implementation, i.e. prioritize correctness over performance for the initial implementation. Use bit 3 for the PRIVATE attribute so that KVM can use bits 0-2 for RWX attributes/protections in the future, e.g. to give userspace fine-grained control over read, write, and execute protections for guest memory. Provide arch hooks for handling attribute changes before and after common code sets the new attributes, e.g. x86 will use the "pre" hook to zap all relevant mappings, and the "post" hook to track whether or not hugepages can be used to map the range. To simplify the implementation wrap the entire sequence with kvm_mmu_invalidate_{begin,end}() even though the operation isn't strictly guaranteed to be an invalidation. For the initial use case, x86 *will* always invalidate memory, and preventing arch code from creating new mappings while the attributes are in flux makes it much easier to reason about the correctness of consuming attributes. It's possible that future usages may not require an invalidation, e.g. if KVM ends up supporting RWX protections and userspace grants _more_ protections, but again opt for simplicity and punt optimizations to if/when they are needed. Suggested-by: Sean Christopherson Link: https://lore.kernel.org/all/Y2WB48kD0J4VGynX@google.com Cc: Fuad Tabba Cc: Xu Yilun Cc: Mickaël Salaün Signed-off-by: Chao Peng Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson --- Documentation/virt/kvm/api.rst | 36 +++++ include/linux/kvm_host.h | 18 +++ include/uapi/linux/kvm.h | 13 ++ virt/kvm/Kconfig | 4 + virt/kvm/kvm_main.c | 233 +++++++++++++++++++++++++++++++++ 5 files changed, 304 insertions(+) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 860216536810..e2252c748fd6 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6091,6 +6091,42 @@ applied. See KVM_SET_USER_MEMORY_REGION. +4.140 KVM_SET_MEMORY_ATTRIBUTES +------------------------------- + +:Capability: KVM_CAP_MEMORY_ATTRIBUTES +:Architectures: x86 +:Type: vm ioctl +:Parameters: struct kvm_memory_attributes(in) +:Returns: 0 on success, <0 on error + +KVM_SET_MEMORY_ATTRIBUTES allows userspace to set memory attributes for a range +of guest physical memory. + +:: + + struct kvm_memory_attributes { + __u64 address; + __u64 size; + __u64 attributes; + __u64 flags; + }; + + #define KVM_MEMORY_ATTRIBUTE_PRIVATE (1ULL << 3) + +The address and size must be page aligned. The supported attributes can be +retrieved via ioctl(KVM_CHECK_EXTENSION) on KVM_CAP_MEMORY_ATTRIBUTES. If +executed on a VM, KVM_CAP_MEMORY_ATTRIBUTES precisely returns the attributes +supported by that VM. If executed at system scope, KVM_CAP_MEMORY_ATTRIBUTES +returns all attributes supported by KVM. The only attribute defined at this +time is KVM_MEMORY_ATTRIBUTE_PRIVATE, which marks the associated gfn as being +guest private memory. + +Note, there is no "get" API. Userspace is responsible for explicitly tracking +the state of a gfn/page as needed. + +The "flags" field is reserved for future extensions and must be '0'. + 5. The kvm_run structure ======================== diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 89c1a991a3b8..df573229651b 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -256,6 +256,7 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); #ifdef CONFIG_KVM_GENERIC_MMU_NOTIFIER union kvm_mmu_notifier_arg { pte_t pte; + unsigned long attributes; }; struct kvm_gfn_range { @@ -808,6 +809,9 @@ struct kvm { #ifdef CONFIG_HAVE_KVM_PM_NOTIFIER struct notifier_block pm_notifier; +#endif +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES + struct xarray mem_attr_array; #endif char stats_id[KVM_STATS_NAME_SIZE]; }; @@ -2340,4 +2344,18 @@ static inline void kvm_prepare_memory_fault_exit(struct kvm_vcpu *vcpu, vcpu->run->memory_fault.flags = 0; } +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES +static inline unsigned long kvm_get_memory_attributes(struct kvm *kvm, gfn_t gfn) +{ + return xa_to_value(xa_load(&kvm->mem_attr_array, gfn)); +} + +bool kvm_range_has_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end, + unsigned long attrs); +bool kvm_arch_pre_set_memory_attributes(struct kvm *kvm, + struct kvm_gfn_range *range); +bool kvm_arch_post_set_memory_attributes(struct kvm *kvm, + struct kvm_gfn_range *range); +#endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */ + #endif diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 7ae9987b48dd..547837feaa28 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1211,6 +1211,7 @@ struct kvm_ppc_resize_hpt { #define KVM_CAP_ARM_SUPPORTED_BLOCK_SIZES 229 #define KVM_CAP_USER_MEMORY2 230 #define KVM_CAP_MEMORY_FAULT_INFO 231 +#define KVM_CAP_MEMORY_ATTRIBUTES 232 #ifdef KVM_CAP_IRQ_ROUTING @@ -2277,4 +2278,16 @@ struct kvm_s390_zpci_op { /* flags for kvm_s390_zpci_op->u.reg_aen.flags */ #define KVM_S390_ZPCIOP_REGAEN_HOST (1 << 0) +/* Available with KVM_CAP_MEMORY_ATTRIBUTES */ +#define KVM_SET_MEMORY_ATTRIBUTES _IOW(KVMIO, 0xd2, struct kvm_memory_attributes) + +struct kvm_memory_attributes { + __u64 address; + __u64 size; + __u64 attributes; + __u64 flags; +}; + +#define KVM_MEMORY_ATTRIBUTE_PRIVATE (1ULL << 3) + #endif /* __LINUX_KVM_H */ diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index ecae2914c97e..5bd7fcaf9089 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -96,3 +96,7 @@ config KVM_GENERIC_HARDWARE_ENABLING config KVM_GENERIC_MMU_NOTIFIER select MMU_NOTIFIER bool + +config KVM_GENERIC_MEMORY_ATTRIBUTES + select KVM_GENERIC_MMU_NOTIFIER + bool diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 302ccb87b4c1..78a0b09ef2a5 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1218,6 +1218,9 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname) spin_lock_init(&kvm->mn_invalidate_lock); rcuwait_init(&kvm->mn_memslots_update_rcuwait); xa_init(&kvm->vcpu_array); +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES + xa_init(&kvm->mem_attr_array); +#endif INIT_LIST_HEAD(&kvm->gpc_list); spin_lock_init(&kvm->gpc_lock); @@ -1398,6 +1401,9 @@ static void kvm_destroy_vm(struct kvm *kvm) } cleanup_srcu_struct(&kvm->irq_srcu); cleanup_srcu_struct(&kvm->srcu); +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES + xa_destroy(&kvm->mem_attr_array); +#endif kvm_arch_free_vm(kvm); preempt_notifier_dec(); hardware_disable_all(); @@ -2396,6 +2402,210 @@ static int kvm_vm_ioctl_clear_dirty_log(struct kvm *kvm, } #endif /* CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT */ +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES +/* + * Returns true if _all_ gfns in the range [@start, @end) have attributes + * matching @attrs. + */ +bool kvm_range_has_memory_attributes(struct kvm *kvm, gfn_t start, gfn_t end, + unsigned long attrs) +{ + XA_STATE(xas, &kvm->mem_attr_array, start); + unsigned long index; + bool has_attrs; + void *entry; + + rcu_read_lock(); + + if (!attrs) { + has_attrs = !xas_find(&xas, end - 1); + goto out; + } + + has_attrs = true; + for (index = start; index < end; index++) { + do { + entry = xas_next(&xas); + } while (xas_retry(&xas, entry)); + + if (xas.xa_index != index || xa_to_value(entry) != attrs) { + has_attrs = false; + break; + } + } + +out: + rcu_read_unlock(); + return has_attrs; +} + +static u64 kvm_supported_mem_attributes(struct kvm *kvm) +{ + if (!kvm) + return KVM_MEMORY_ATTRIBUTE_PRIVATE; + + return 0; +} + +static __always_inline void kvm_handle_gfn_range(struct kvm *kvm, + struct kvm_mmu_notifier_range *range) +{ + struct kvm_gfn_range gfn_range; + struct kvm_memory_slot *slot; + struct kvm_memslots *slots; + struct kvm_memslot_iter iter; + bool found_memslot = false; + bool ret = false; + int i; + + gfn_range.arg = range->arg; + gfn_range.may_block = range->may_block; + + /* + * If/when KVM supports more attributes beyond private .vs shared, this + * _could_ set only_{private,shared} appropriately if the entire target + * range already has the desired private vs. shared state (it's unclear + * if that is a net win). For now, KVM reaches this point if and only + * if the private flag is being toggled, i.e. all mappings are in play. + */ + gfn_range.only_private = false; + gfn_range.only_shared = false; + + for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { + slots = __kvm_memslots(kvm, i); + + kvm_for_each_memslot_in_gfn_range(&iter, slots, range->start, range->end) { + slot = iter.slot; + gfn_range.slot = slot; + + gfn_range.start = max(range->start, slot->base_gfn); + gfn_range.end = min(range->end, slot->base_gfn + slot->npages); + if (gfn_range.start >= gfn_range.end) + continue; + + if (!found_memslot) { + found_memslot = true; + KVM_MMU_LOCK(kvm); + if (!IS_KVM_NULL_FN(range->on_lock)) + range->on_lock(kvm); + } + + ret |= range->handler(kvm, &gfn_range); + } + } + + if (range->flush_on_ret && ret) + kvm_flush_remote_tlbs(kvm); + + if (found_memslot) + KVM_MMU_UNLOCK(kvm); +} + +static bool kvm_pre_set_memory_attributes(struct kvm *kvm, + struct kvm_gfn_range *range) +{ + /* + * Unconditionally add the range to the invalidation set, regardless of + * whether or not the arch callback actually needs to zap SPTEs. E.g. + * if KVM supports RWX attributes in the future and the attributes are + * going from R=>RW, zapping isn't strictly necessary. Unconditionally + * adding the range allows KVM to require that MMU invalidations add at + * least one range between begin() and end(), e.g. allows KVM to detect + * bugs where the add() is missed. Rexlaing the rule *might* be safe, + * but it's not obvious that allowing new mappings while the attributes + * are in flux is desirable or worth the complexity. + */ + kvm_mmu_invalidate_range_add(kvm, range->start, range->end); + + return kvm_arch_pre_set_memory_attributes(kvm, range); +} + +/* Set @attributes for the gfn range [@start, @end). */ +static int kvm_vm_set_mem_attributes(struct kvm *kvm, gfn_t start, gfn_t end, + unsigned long attributes) +{ + struct kvm_mmu_notifier_range pre_set_range = { + .start = start, + .end = end, + .handler = kvm_pre_set_memory_attributes, + .on_lock = kvm_mmu_invalidate_begin, + .flush_on_ret = true, + .may_block = true, + }; + struct kvm_mmu_notifier_range post_set_range = { + .start = start, + .end = end, + .arg.attributes = attributes, + .handler = kvm_arch_post_set_memory_attributes, + .on_lock = kvm_mmu_invalidate_end, + .may_block = true, + }; + unsigned long i; + void *entry; + int r = 0; + + entry = attributes ? xa_mk_value(attributes) : NULL; + + mutex_lock(&kvm->slots_lock); + + /* Nothing to do if the entire range as the desired attributes. */ + if (kvm_range_has_memory_attributes(kvm, start, end, attributes)) + goto out_unlock; + + /* + * Reserve memory ahead of time to avoid having to deal with failures + * partway through setting the new attributes. + */ + for (i = start; i < end; i++) { + r = xa_reserve(&kvm->mem_attr_array, i, GFP_KERNEL_ACCOUNT); + if (r) + goto out_unlock; + } + + kvm_handle_gfn_range(kvm, &pre_set_range); + + for (i = start; i < end; i++) { + r = xa_err(xa_store(&kvm->mem_attr_array, i, entry, + GFP_KERNEL_ACCOUNT)); + KVM_BUG_ON(r, kvm); + } + + kvm_handle_gfn_range(kvm, &post_set_range); + +out_unlock: + mutex_unlock(&kvm->slots_lock); + + return r; +} +static int kvm_vm_ioctl_set_mem_attributes(struct kvm *kvm, + struct kvm_memory_attributes *attrs) +{ + gfn_t start, end; + + /* flags is currently not used. */ + if (attrs->flags) + return -EINVAL; + if (attrs->attributes & ~kvm_supported_mem_attributes(kvm)) + return -EINVAL; + if (attrs->size == 0 || attrs->address + attrs->size < attrs->address) + return -EINVAL; + if (!PAGE_ALIGNED(attrs->address) || !PAGE_ALIGNED(attrs->size)) + return -EINVAL; + + start = attrs->address >> PAGE_SHIFT; + end = (attrs->address + attrs->size) >> PAGE_SHIFT; + + /* + * xarray tracks data using "unsigned long", and as a result so does + * KVM. For simplicity, supports generic attributes only on 64-bit + * architectures. + */ + BUILD_BUG_ON(sizeof(attrs->attributes) != sizeof(unsigned long)); + + return kvm_vm_set_mem_attributes(kvm, start, end, attrs->attributes); +} +#endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */ + struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn) { return __gfn_to_memslot(kvm_memslots(kvm), gfn); @@ -4640,6 +4850,17 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg) case KVM_CAP_BINARY_STATS_FD: case KVM_CAP_SYSTEM_EVENT_DATA: return 1; +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES + case KVM_CAP_MEMORY_ATTRIBUTES: + u64 attrs = kvm_supported_mem_attributes(kvm); + + r = -EFAULT; + if (copy_to_user(argp, &attrs, sizeof(attrs))) + goto out; + r = 0; + break; + } +#endif default: break; } @@ -5022,6 +5243,18 @@ static long kvm_vm_ioctl(struct file *filp, break; } #endif /* CONFIG_HAVE_KVM_IRQ_ROUTING */ +#ifdef CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES + case KVM_SET_MEMORY_ATTRIBUTES: { + struct kvm_memory_attributes attrs; + + r = -EFAULT; + if (copy_from_user(&attrs, argp, sizeof(attrs))) + goto out; + + r = kvm_vm_ioctl_set_mem_attributes(kvm, &attrs); + break; + } +#endif /* CONFIG_KVM_GENERIC_MEMORY_ATTRIBUTES */ case KVM_CREATE_DEVICE: { struct kvm_create_device cd;