Message ID | 20250211121128.703390-10-tabba@google.com (mailing list archive)
---|---
State | New
Series | KVM: Mapping guest_memfd backed memory at the host for software protected VMs
Hi Fuad,

On Tuesday 11 Feb 2025 at 12:11:25 (+0000), Fuad Tabba wrote:
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 117937a895da..f155d3781e08 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -652,6 +652,12 @@ struct kvm_enable_cap {
>  #define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
>  #define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
>  	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
> +
> +#define KVM_VM_TYPE_ARM_SW_PROTECTED	(1UL << 9)

FWIW, the downstream Android code has used bit 31 since forever
for that.

Although I very much believe that upstream should not care about the
downstream mess in general, in this particular instance bit 9 really
isn't superior in any way, and there's a bunch of existing userspace
code that uses bit 31 today as we speak. It is very much Android's
problem to update these userspace programs if we do go with bit 9
upstream, but I don't really see how that would benefit upstream
either.

So, given that there is no maintenance cost for upstream to use bit 31
instead of 9, I'd vote for using bit 31 and ease the landing with
existing userspace code, unless folks are really opinionated with this
stuff :)

Thanks,
Quentin

> +#define KVM_VM_TYPE_MASK	(KVM_VM_TYPE_ARM_IPA_SIZE_MASK | \
> +				 KVM_VM_TYPE_ARM_SW_PROTECTED)
> +
>  /*
>   * ioctls for /dev/kvm fds:
>   */
> --
> 2.48.1.502.g6dc24dfdaf-goog
Hi Quentin,

On Tue, 11 Feb 2025 at 16:12, Quentin Perret <qperret@google.com> wrote:
>
> On Tuesday 11 Feb 2025 at 12:11:25 (+0000), Fuad Tabba wrote:
> > [...]
> > +#define KVM_VM_TYPE_ARM_SW_PROTECTED	(1UL << 9)
>
> FWIW, the downstream Android code has used bit 31 since forever
> for that.
>
> Although I very much believe that upstream should not care about the
> downstream mess in general, in this particular instance bit 9 really
> isn't superior in any way, and there's a bunch of existing userspace
> code that uses bit 31 today as we speak. It is very much Android's
> problem to update these userspace programs if we do go with bit 9
> upstream, but I don't really see how that would benefit upstream
> either.
>
> So, given that there is no maintenance cost for upstream to use bit 31
> instead of 9, I'd vote for using bit 31 and ease the landing with
> existing userspace code, unless folks are really opinionated with this
> stuff :)

My thinking is that this bit does _not_ mean pKVM. It means an
experimental software VM that is similar to the x86
KVM_X86_SW_PROTECTED_VM. Hence why I didn't choose bit 31.

From Documentation/virt/kvm/api.rst (for x86):

'''
Note, KVM_X86_SW_PROTECTED_VM is currently only for development and testing.
Do not use KVM_X86_SW_PROTECTED_VM for "real" VMs, and especially not in
production. The behavior and effective ABI for software-protected VMs is
unstable.
'''

which is similar to the documentation I added here.

Cheers,
/fuad

> Thanks,
> Quentin
On Tuesday 11 Feb 2025 at 16:17:25 (+0000), Fuad Tabba wrote:
> My thinking is that this bit does _not_ mean pKVM. It means an
> experimental software VM that is similar to the x86
> KVM_X86_SW_PROTECTED_VM. Hence why I didn't choose bit 31.
>
> From Documentation/virt/kvm/api.rst (for x86):
>
> '''
> Note, KVM_X86_SW_PROTECTED_VM is currently only for development and testing.
> Do not use KVM_X86_SW_PROTECTED_VM for "real" VMs, and especially not in
> production. The behavior and effective ABI for software-protected VMs is
> unstable.
> '''
>
> which is similar to the documentation I added here.

Aha, I see, but are we going to allocate _another_ bit for protected VMs
proper once they're supported? Or just update the doc for the existing
bit? If the latter, then I guess this discussion can still happen :)

Thanks,
Quentin
Hi Quentin,

On Tue, 2025-02-11 at 16:29 +0000, Quentin Perret wrote:
> On Tuesday 11 Feb 2025 at 16:17:25 (+0000), Fuad Tabba wrote:
>> My thinking is that this bit does _not_ mean pKVM. It means an
>> experimental software VM that is similar to the x86
>> KVM_X86_SW_PROTECTED_VM. Hence why I didn't choose bit 31.
>
> Aha, I see, but are we going to allocate _another_ bit for protected VMs
> proper once they're supported? Or just update the doc for the existing
> bit? If the latter, then I guess this discussion can still happen :)

I was hoping that SW_PROTECTED_VM will be the VM type that something
like Firecracker could use, e.g. an interface to guest_memfd specifically
_without_ pKVM, as Fuad was saying.

> Thanks,
> Quentin

Best,
Patrick
Hi Patrick,

On Tuesday 11 Feb 2025 at 16:32:31 (+0000), Patrick Roy wrote:
> I was hoping that SW_PROTECTED_VM will be the VM type that something
> like Firecracker could use, e.g. an interface to guest_memfd specifically
> _without_ pKVM, as Fuad was saying.

I had, probably incorrectly, assumed that we'd eventually want to allow
gmem for all VMs, including traditional KVM VMs that don't have anything
special. Perhaps the gmem support could be exposed via a KVM_CAP in this
case?

Anyway, no objection to the proposed approach in this patch assuming we
will eventually have HW_PROTECTED_VM for pKVM VMs, and that _that_ can be
bit 31 :).

Thanks,
Quentin
On Tuesday 11 Feb 2025 at 17:09:20 (+0000), Quentin Perret wrote:
> Hi Patrick,
>
> On Tuesday 11 Feb 2025 at 16:32:31 (+0000), Patrick Roy wrote:
> > I was hoping that SW_PROTECTED_VM will be the VM type that something
> > like Firecracker could use, e.g. an interface to guest_memfd specifically
> > _without_ pKVM, as Fuad was saying.
>
> I had, probably incorrectly, assumed that we'd eventually want to allow
> gmem for all VMs, including traditional KVM VMs that don't have anything
> special. Perhaps the gmem support could be exposed via a KVM_CAP in this
> case?
>
> Anyway, no objection to the proposed approach in this patch assuming we
> will eventually have HW_PROTECTED_VM for pKVM VMs, and that _that_ can be
> bit 31 :).

Thinking about this a bit deeper, I am still wondering what this new
SW_PROTECTED VM type is buying us? Given that SW_PROTECTED VMs accept
both guest-memfd backed memslots and traditional HVA-backed memslots, we
could just make normal KVM guests accept guest-memfd memslots and get
the same thing? Is there any reason not to do that instead? Even though
SW_PROTECTED VMs are documented as 'unstable', the reality is this is
UAPI and you can bet it will end up being relied upon, so I would prefer
to have a solid reason for introducing this new VM type.

Cheers,
Quentin
Hi Quentin,

On Fri, 14 Feb 2025 at 11:13, Quentin Perret <qperret@google.com> wrote:
>
> Thinking about this a bit deeper, I am still wondering what this new
> SW_PROTECTED VM type is buying us? Given that SW_PROTECTED VMs accept
> both guest-memfd backed memslots and traditional HVA-backed memslots, we
> could just make normal KVM guests accept guest-memfd memslots and get
> the same thing? Is there any reason not to do that instead? Even though
> SW_PROTECTED VMs are documented as 'unstable', the reality is this is
> UAPI and you can bet it will end up being relied upon, so I would prefer
> to have a solid reason for introducing this new VM type.

The more I think about it, I agree with you. I think that reasonable
behavior (for kvm/arm64) would be to allow using guest_memfd with all
VM types. If the VM type is a non-protected type, then its memory is
considered shared by default and is mappable --- as long as the
kconfig option is enabled. If the VM is protected then the memory is
not shared by default.

What do you think Patrick? Do you need an explicit VM type?

Cheers,
/fuad

> Cheers,
> Quentin
On Fri, 2025-02-14 at 11:33 +0000, Fuad Tabba wrote:
> The more I think about it, I agree with you. I think that reasonable
> behavior (for kvm/arm64) would be to allow using guest_memfd with all
> VM types. If the VM type is a non-protected type, then its memory is
> considered shared by default and is mappable --- as long as the
> kconfig option is enabled. If VM is protected then the memory is not
> shared by default.
>
> What do you think Patrick? Do you need an explicit VM type?

Mhh, no, if "normal" VMs support guest_memfd, then that works too. I
suggested the VM type because that's how x86 works
(KVM_X86_SW_PROTECTED_VM), but never actually stopped to think about
whether it makes sense for ARM. Maybe Sean knows something we're missing?

I wonder whether having the "default sharedness" depend on the vm type
works out though - whether a range of gmem is shared or private is a
property of the guest_memfd instance, not the VM it's attached to, so I
guess the default behavior needs to be based solely on the guest_memfd
as well (and then if someone tries to attach a gmem to a VM whose desire
of protection doesn't match the guest_memfd's configuration, that
operation would fail)?

Tangentially related, does KVM_GMEM_SHARED to you mean "guest_memfd also
supports shared sections", or "guest_memfd does not support private
memory anymore"? (the difference being that in the former,
KVM_GMEM_SHARED would later get the ability to convert ranges private,
and the EOPNOTSUPP is just a transient state until conversion support is
merged) - doesn't matter for my use case, but I got curious as some other
threads implied the second option to me and I ended up wondering why.

Best,
Patrick

> Cheers,
> /fuad
Hi Patrick,

On Fri, 14 Feb 2025 at 12:37, Patrick Roy <roypat@amazon.co.uk> wrote:
> I wonder whether having the "default sharedness" depend on the vm type
> works out though - whether a range of gmem is shared or private is a
> property of the guest_memfd instance, not the VM it's attached to, so I
> guess the default behavior needs to be based solely on the guest_memfd
> as well (and then if someone tries to attach a gmem to a VM whose desire
> of protection doesnt match the guest_memfd's configuration, that
> operation would fail)?

Each guest_memfd is associated with a KVM instance. Although it could
migrate, it would be weird for a guest_memfd instance to migrate
between different types of VM, or at least, migrate between VMs that
have different confidentiality requirements.

> Tangentially related, does KVM_GMEM_SHARED to you mean "guest_memfd also
> supports shared sections", or "guest_memfd does not support private
> memory anymore"? (the difference being that in the former, then
> KVM_GMEM_SHARED would later get the ability to convert ranges private,
> and the EOPNOSUPP is just a transient state until conversion support is
> merged) - doesnt matter for my usecase, but I got curious as some other
> threads implied the second option to me and I ended up wondering why.

My thinking (and implementation in the other patch series) is that
KVM_GMEM_SHARED (back then called KVM_GMEM_MAPPABLE) allows sharing in
place/mapping, without adding restrictions.

Cheers,
/fuad
On Fri, 2025-02-14 at 13:11 +0000, Fuad Tabba wrote:
> Each guest_memfd is associated with a KVM instance. Although it could
> migrate, it would be weird for a guest_memfd instance to migrate
> between different types of VM, or at least, migrate between VMs that
> have different confidentiality requirements.

Ahh, right, I keep forgetting that CREATE_GUEST_MEMFD() is a vm ioctl.
My bad, sorry!

> My thinking (and implementation in the other patch series) is that
> KVM_GMEM_SHARED (back then called KVM_GMEM_MAPPABLE) allows sharing in
> place/mapping, without adding restrictions.

That makes sense to me, thanks for the explanation!

Best,
Patrick
On Fri, Feb 14, 2025, Patrick Roy wrote:
> On Fri, 2025-02-14 at 13:11 +0000, Fuad Tabba wrote:
> > On Fri, 14 Feb 2025 at 12:37, Patrick Roy <roypat@amazon.co.uk> wrote:
> >> On Fri, 2025-02-14 at 11:33 +0000, Fuad Tabba wrote:
> >>> On Fri, 14 Feb 2025 at 11:13, Quentin Perret <qperret@google.com> wrote:
> >>>> Thinking about this a bit deeper, I am still wondering what this new
> >>>> SW_PROTECTED VM type is buying us? Given that SW_PROTECTED VMs accept
> >>>> both guest-memfd backed memslots and traditional HVA-backed memslots, we
> >>>> could just make normal KVM guests accept guest-memfd memslots and get
> >>>> the same thing? Is there any reason not to do that instead?

Once guest_memfd can be mmap()'d, no. KVM_X86_SW_PROTECTED_VM was added
for testing and development of guest_memfd largely because KVM can't
support a "real" VM if KVM can't read/write guest memory through its
normal mechanisms. The gap is most apparent on x86, but it holds true
for arm64 as well.

> >>>> Even though SW_PROTECTED VMs are documented as 'unstable', the reality
> >>>> is this is UAPI and you can bet it will end up being relied upon, so I
> >>>> would prefer to have a solid reason for introducing this new VM type.
> >>>
> >>> The more I think about it, I agree with you. I think that reasonable
> >>> behavior (for kvm/arm64) would be to allow using guest_memfd with all
> >>> VM types. If the VM type is a non-protected type, then its memory is
> >>> considered shared by default and is mappable --- as long as the
> >>> kconfig option is enabled. If VM is protected then the memory is not
> >>> shared by default.

This aligns with what I see happening for x86, except that for
non-protected VMs there will be no shared vs. private, because such VMs
won't have a concept of private memory.
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 2b52eb77e29c..0fccee4feee7 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -214,6 +214,11 @@ exposed by the guest CPUs in ID_AA64MMFR0_EL1[PARange]. It only affects
 size of the address translated by the stage2 level (guest physical to
 host physical address translations).
 
+KVM_VM_TYPE_ARM_SW_PROTECTED is currently only for development and testing of
+confidential VMs without having underlying support. Do not use
+KVM_VM_TYPE_ARM_SW_PROTECTED for "real" VMs, and especially not in production.
+The behavior and effective ABI for software-protected VMs is unstable.
+
 4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
 ----------------------------------------------------------
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7cfa024de4e3..a4276d56f54d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -383,6 +383,8 @@ struct kvm_arch {
 	 * the associated pKVM instance in the hypervisor.
 	 */
 	struct kvm_protected_vm pkvm;
+
+	unsigned long vm_type;
 };
 
 struct kvm_vcpu_fault_info {
@@ -1555,4 +1557,12 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 #define kvm_has_s1poe(k)				\
 	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
 
+#define kvm_arch_has_private_mem(kvm)			\
+	(IS_ENABLED(CONFIG_KVM_PRIVATE_MEM) &&		\
+	 ((kvm)->arch.vm_type & KVM_VM_TYPE_ARM_SW_PROTECTED))
+
+#define kvm_arch_gmem_supports_shared_mem(kvm)		\
+	(IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&	\
+	 ((kvm)->arch.vm_type & KVM_VM_TYPE_ARM_SW_PROTECTED))
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 071a7d75be68..a2066db52ada 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -146,6 +146,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
 	int ret;
 
+	if (type & ~KVM_VM_TYPE_MASK)
+		return -EINVAL;
+
 	mutex_init(&kvm->arch.config_lock);
 
 #ifdef CONFIG_LOCKDEP
@@ -187,6 +190,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 	bitmap_zero(kvm->arch.vcpu_features, KVM_VCPU_MAX_FEATURES);
 
+	kvm->arch.vm_type = type;
+
 	return 0;
 
 err_free_cpumask:
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 305060518766..b89649d31127 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -882,9 +882,6 @@ static int kvm_init_ipa_range(struct kvm_s2_mmu *mmu, unsigned long type)
 	u64 mmfr0, mmfr1;
 	u32 phys_shift;
 
-	if (type & ~KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
-		return -EINVAL;
-
 	phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type);
 	if (is_protected_kvm_enabled()) {
 		phys_shift = kvm_ipa_limit;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 117937a895da..f155d3781e08 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -652,6 +652,12 @@ struct kvm_enable_cap {
 #define KVM_VM_TYPE_ARM_IPA_SIZE_MASK	0xffULL
 #define KVM_VM_TYPE_ARM_IPA_SIZE(x)		\
 	((x) & KVM_VM_TYPE_ARM_IPA_SIZE_MASK)
+
+#define KVM_VM_TYPE_ARM_SW_PROTECTED	(1UL << 9)
+
+#define KVM_VM_TYPE_MASK	(KVM_VM_TYPE_ARM_IPA_SIZE_MASK | \
+				 KVM_VM_TYPE_ARM_SW_PROTECTED)
+
 /*
  * ioctls for /dev/kvm fds:
  */
Introduce a new virtual machine type, KVM_VM_TYPE_ARM_SW_PROTECTED, to
serve as a development and testing vehicle for Confidential (CoCo) VMs,
similar to the x86 KVM_X86_SW_PROTECTED_VM type.

Initially, this is used to test guest_memfd without needing any
underlying protection.

Similar to the x86 type, this is currently only for development and
testing. Do not use KVM_VM_TYPE_ARM_SW_PROTECTED for "real" VMs, and
especially not in production. The behavior and effective ABI for
software-protected VMs is unstable.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/api.rst    |  5 +++++
 arch/arm64/include/asm/kvm_host.h | 10 ++++++++++
 arch/arm64/kvm/arm.c              |  5 +++++
 arch/arm64/kvm/mmu.c              |  3 ---
 include/uapi/linux/kvm.h          |  6 ++++++
 5 files changed, 26 insertions(+), 3 deletions(-)