Message ID | 20240605220504.2941958-2-minipli@grsecurity.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: Reject vCPU IDs above 2^32 | expand |
On Thu, Jun 06, 2024, Mathias Krause wrote: > If, on a 64 bit system, a vCPU ID is provided that has the upper 32 bits > set to a non-zero value, it may get accepted if the truncated to 32 bits > integer value is below KVM_MAX_VCPU_IDS and 'max_vcpus'. This feels very > wrong and triggered the reporting logic of PaX's SIZE_OVERFLOW plugin. > > Instead of silently truncating and accepting such values, pass the full > value to kvm_vm_ioctl_create_vcpu() and make the existing limit checks > return an error. > > Even if this is a userland ABI breaking change, no sane userland could > have ever relied on that behaviour. > > Reported-by: PaX's SIZE_OVERFLOW plugin running on grsecurity's syzkaller > Fixes: 6aa8b732ca01 ("[PATCH] kvm: userspace interface") > Cc: Emese Revfy <re.emese@gmail.com> > Cc: PaX Team <pageexec@freemail.hu> > Signed-off-by: Mathias Krause <minipli@grsecurity.net> > --- > virt/kvm/kvm_main.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 14841acb8b95..9f18fc42f018 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -4200,7 +4200,7 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) > /* > * Creates some virtual cpus. Good luck creating more than one. > */ > -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) > +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) Hmm, I don't love that KVM subtly relies on the KVM_MAX_VCPU_IDS check to guard against truncation when passing @id to kvm_arch_vcpu_precreate(), kvm_vcpu_init(), etc. I doubt that it will ever be problematic, but it _looks_ like a bug. If we really care enough to fix this, my vote is for something like so: diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4965196cad58..08adfdb2817e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4200,13 +4200,14 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) /* * Creates some virtual cpus. Good luck creating more than one. */ -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long __id) { int r; struct kvm_vcpu *vcpu; struct page *page; + u32 id = __id; - if (id >= KVM_MAX_VCPU_IDS) + if (id != __id || id >= KVM_MAX_VCPU_IDS) return -EINVAL; mutex_lock(&kvm->lock);
On 06.06.24 00:31, Sean Christopherson wrote: > On Thu, Jun 06, 2024, Mathias Krause wrote: >> If, on a 64 bit system, a vCPU ID is provided that has the upper 32 bits >> set to a non-zero value, it may get accepted if the truncated to 32 bits >> integer value is below KVM_MAX_VCPU_IDS and 'max_vcpus'. This feels very >> wrong and triggered the reporting logic of PaX's SIZE_OVERFLOW plugin. >> >> Instead of silently truncating and accepting such values, pass the full >> value to kvm_vm_ioctl_create_vcpu() and make the existing limit checks >> return an error. >> >> Even if this is a userland ABI breaking change, no sane userland could >> have ever relied on that behaviour. >> >> Reported-by: PaX's SIZE_OVERFLOW plugin running on grsecurity's syzkaller >> Fixes: 6aa8b732ca01 ("[PATCH] kvm: userspace interface") >> Cc: Emese Revfy <re.emese@gmail.com> >> Cc: PaX Team <pageexec@freemail.hu> >> Signed-off-by: Mathias Krause <minipli@grsecurity.net> >> --- >> virt/kvm/kvm_main.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >> index 14841acb8b95..9f18fc42f018 100644 >> --- a/virt/kvm/kvm_main.c >> +++ b/virt/kvm/kvm_main.c >> @@ -4200,7 +4200,7 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) >> /* >> * Creates some virtual cpus. Good luck creating more than one. >> */ >> -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) >> +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) > > Hmm, I don't love that KVM subtly relies on the KVM_MAX_VCPU_IDS check to guard > against truncation when passing @id to kvm_arch_vcpu_precreate(), kvm_vcpu_init(), > etc. I doubt that it will ever be problematic, but it _looks_ like a bug. It's not subtle but very explicit. KVM_MAX_VCPU_IDS is a small positive number, depending on some arch specific #define, but with x86 allowing for the largest value of 4 * 4096. That value, for sure, cannot exceed U32_MAX, so an explicit truncation isn't needed as the upper bits will already be zero if the limit check passes. While subtile integer truncation is the bug that my patch is actually fixing, it is for the *userland* facing part of it, as in clarifying the ABI to work on "machine-sized words", i.e. a ulong, and doing the limit checks on these. *In-kernel* APIs truncate / sign extend / mix signed/unsigned values all the time. The kernel is full of these. Trying to "fix" them all is an uphill battle not worth fighting, imho. > > If we really care enough to fix this, my vote is for something like so: > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 4965196cad58..08adfdb2817e 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -4200,13 +4200,14 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) > /* > * Creates some virtual cpus. Good luck creating more than one. > */ > -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) > +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long __id) > { > int r; > struct kvm_vcpu *vcpu; > struct page *page; > + u32 id = __id; > > - if (id >= KVM_MAX_VCPU_IDS) > + if (id != __id || id >= KVM_MAX_VCPU_IDS) > return -EINVAL; > > mutex_lock(&kvm->lock); I'd rather suggest to add a build time assert instead, as the existing runtime check is sufficient (with my u32->ulong change). Something like this: --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4200,12 +4200,13 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) /* * Creates some virtual cpus. Good luck creating more than one. */ -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) { int r; struct kvm_vcpu *vcpu; struct page *page; + BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX); if (id >= KVM_MAX_VCPU_IDS) return -EINVAL; Thanks, Mathias
On Thu, Jun 06, 2024, Mathias Krause wrote: > On 06.06.24 00:31, Sean Christopherson wrote: > > On Thu, Jun 06, 2024, Mathias Krause wrote: > >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > >> index 14841acb8b95..9f18fc42f018 100644 > >> --- a/virt/kvm/kvm_main.c > >> +++ b/virt/kvm/kvm_main.c > >> @@ -4200,7 +4200,7 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) > >> /* > >> * Creates some virtual cpus. Good luck creating more than one. > >> */ > >> -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) > >> +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) > > > > Hmm, I don't love that KVM subtly relies on the KVM_MAX_VCPU_IDS check to guard > > against truncation when passing @id to kvm_arch_vcpu_precreate(), kvm_vcpu_init(), > > etc. I doubt that it will ever be problematic, but it _looks_ like a bug. > > It's not subtle but very explicit. KVM_MAX_VCPU_IDS is a small positive > number, depending on some arch specific #define, but with x86 allowing > for the largest value of 4 * 4096. That value, for sure, cannot exceed > U32_MAX, so an explicit truncation isn't needed as the upper bits will > already be zero if the limit check passes. > > While subtile integer truncation is the bug that my patch is actually > fixing, it is for the *userland* facing part of it, as in clarifying the > ABI to work on "machine-sized words", i.e. a ulong, and doing the limit > checks on these. > > *In-kernel* APIs truncate / sign extend / mix signed/unsigned values all > the time. The kernel is full of these. Trying to "fix" them all is an > uphill battle not worth fighting, imho. Oh, I'm not worry about something going wrong with the actual truncation. What I don't like is the primary in-kernal API, kvm_vm_ioctl_create_vcpu(), taking an unsigned long, but everything underneath converting that to an unsigned int, without much of anything to give the reader a clue that the truncation is deliberate. Similarly, without the context of the changelog, it's not at all obvious why kvm_vm_ioctl_create_vcpu() takes an unsigned long. E.g. x86 has another potentially more restrictive check on @id, and it looks quite odd to check @id against KVM_MAX_VCPU_IDS as an "unsigned long" in flow flow, but as an "unsigned int" in another. int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id) { if (kvm_check_tsc_unstable() && kvm->created_vcpus) pr_warn_once("SMP vm created on host with unstable TSC; " "guest TSC will not be reliable\n"); if (!kvm->arch.max_vcpu_ids) kvm->arch.max_vcpu_ids = KVM_MAX_VCPU_IDS; if (id >= kvm->arch.max_vcpu_ids) return -EINVAL; > I'd rather suggest to add a build time assert instead, as the existing > runtime check is sufficient (with my u32->ulong change). Something like > this: > > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -4200,12 +4200,13 @@ static void kvm_create_vcpu_debugfs(struct > kvm_vcpu *vcpu) > /* > * Creates some virtual cpus. Good luck creating more than one. > */ > -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) > +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) > { > int r; > struct kvm_vcpu *vcpu; > struct page *page; > > + BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX); This should be UINT_MAX, no? Regardless, the "need" for an explicit BUILD_BUG_ON() is another reason I dislike relying on the KVM_MAX_VCPU_IDS check to detect truncation. If @id is checked as a 32-bit value, and we somehow screw up and define KVM_MAX_VCPU_IDS to be a 64-bit value, clang will rightly complain that the check is useless, e.g. given "#define KVM_MAX_VCPU_ID_TEST BIT(32)" arch/x86/kvm/x86.c:12171:9: error: result of comparison of constant 4294967296 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (id > KVM_MAX_VCPU_ID_TEST) ~~ ^ ~~~~~~~~~~~~~~~~~~~~ 1 error generated. > if (id >= KVM_MAX_VCPU_IDS) > return -EINVAL; What if we do an explicit check before calling kvm_vm_ioctl_create_vcpu()? That would avoid the weird __id param, and provide a convenient location to document exactly why KVM checks for truncation. We could also move the "if (id >= KVM_MAX_VCPU_IDS)" check to kvm_vm_ioctl(), but I don't love that, because again IMO it makes the code less readable overall, loses clang's tuautological constant check, and the cost of the extra check against UINT_MAX is completely negligible. Though if I had to choose, I'd prefer moving the check to kvm_vm_ioctl() over taking an "unsigned long" in kvm_vm_ioctl_create_vcpu(). diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 4965196cad58..8155146b16cd 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5083,6 +5083,13 @@ static long kvm_vm_ioctl(struct file *filp, return -EIO; switch (ioctl) { case KVM_CREATE_VCPU: + /* + * KVM tracks vCPU ID as a 32-bit value, be kind to userspace + * and reject too-large values instead of silently truncating. + */ + if (arg > UINT_MAX) + return -EINVAL; + r = kvm_vm_ioctl_create_vcpu(kvm, arg); break; case KVM_ENABLE_CAP: {
On 06.06.24 16:55, Sean Christopherson wrote: > On Thu, Jun 06, 2024, Mathias Krause wrote: >> On 06.06.24 00:31, Sean Christopherson wrote: >>> On Thu, Jun 06, 2024, Mathias Krause wrote: >>>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c >>>> index 14841acb8b95..9f18fc42f018 100644 >>>> --- a/virt/kvm/kvm_main.c >>>> +++ b/virt/kvm/kvm_main.c >>>> @@ -4200,7 +4200,7 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) >>>> /* >>>> * Creates some virtual cpus. Good luck creating more than one. >>>> */ >>>> -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) >>>> +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) >>> >>> Hmm, I don't love that KVM subtly relies on the KVM_MAX_VCPU_IDS check to guard >>> against truncation when passing @id to kvm_arch_vcpu_precreate(), kvm_vcpu_init(), >>> etc. I doubt that it will ever be problematic, but it _looks_ like a bug. >> >> It's not subtle but very explicit. KVM_MAX_VCPU_IDS is a small positive >> number, depending on some arch specific #define, but with x86 allowing >> for the largest value of 4 * 4096. That value, for sure, cannot exceed >> U32_MAX, so an explicit truncation isn't needed as the upper bits will >> already be zero if the limit check passes. >> >> While subtile integer truncation is the bug that my patch is actually >> fixing, it is for the *userland* facing part of it, as in clarifying the >> ABI to work on "machine-sized words", i.e. a ulong, and doing the limit >> checks on these. >> >> *In-kernel* APIs truncate / sign extend / mix signed/unsigned values all >> the time. The kernel is full of these. Trying to "fix" them all is an >> uphill battle not worth fighting, imho. > > Oh, I'm not worry about something going wrong with the actual truncation. Well, I do ;) > > What I don't like is the primary in-kernal API, kvm_vm_ioctl_create_vcpu(), taking > an unsigned long, but everything underneath converting that to an unsigned int, > without much of anything to give the reader a clue that the truncation is > deliberate. Well, again, it's clear to me, at least. kvm_vm_ioctl_create_vcpu() is the barrier from converting a userland provided "raw" value and checking it to be within bounds that are sensible for in-kernel use. After that check it's fine to use a more narrow type that still fits these bounds and use that for in-kernel use. The first part is _completely_ handled by the 'id >= KVM_MAX_VCPU_IDS' test, as 'id' is still the "raw" value userland provided. Testing it against KVM_MAX_VCPU_IDS does the "sensible for in-kernel use" check and, on passing that check, allows to limit the storage type for 'id' to the bounds of [0,KVM_MAX_VCPU_IDS) which fits within u32, unsigned int or, as actually used for the vcpu_id member, an int.¹ > > Similarly, without the context of the changelog, it's not at all obvious why > kvm_vm_ioctl_create_vcpu() takes an unsigned long. Well, it's the ioctl() entry path. Passing on the UABI value shouldn't be all that surprising. But I agree, a comment explaining the type- and value handling to avoid early truncation thereof may not be such a bad idea. > > E.g. x86 has another potentially more restrictive check on @id, and it looks > quite odd to check @id against KVM_MAX_VCPU_IDS as an "unsigned long" in flow > flow, but as an "unsigned int" in another. Again, that's two distinct things, even if looking similar. The first check against KVM_MAX_VCPU_IDS does actually two things: 1/ Ensure the full user ABI provided value (ulong) is sane and 2/ ensure it's within the hard limits KVM expects (fits unsigned int). Now we do both with only a single compare and maybe that's what's so hard to grasp -- that a single check can do both things. But why not make use of simple things when we can do so? Regarding x86's kvm_arch_vcpu_precreate(), it can use an 'unsigned int' because the KVM_MAX_VCPU_IDS check ensured that the value of 'id' cannot exceed that (limited) type. It's after the "UABI raw value" to "fits kernel-internal type" conversion has happened, so using the narrower type is just fine. > > int kvm_arch_vcpu_precreate(struct kvm *kvm, unsigned int id) > { > if (kvm_check_tsc_unstable() && kvm->created_vcpus) > pr_warn_once("SMP vm created on host with unstable TSC; " > "guest TSC will not be reliable\n"); > > if (!kvm->arch.max_vcpu_ids) > kvm->arch.max_vcpu_ids = KVM_MAX_VCPU_IDS; > > if (id >= kvm->arch.max_vcpu_ids) > return -EINVAL; > Above is completely fine, as this code only executes after the narrower type bounds check. >> I'd rather suggest to add a build time assert instead, as the existing >> runtime check is sufficient (with my u32->ulong change). Something like >> this: >> >> --- a/virt/kvm/kvm_main.c >> +++ b/virt/kvm/kvm_main.c >> @@ -4200,12 +4200,13 @@ static void kvm_create_vcpu_debugfs(struct >> kvm_vcpu *vcpu) >> /* >> * Creates some virtual cpus. Good luck creating more than one. >> */ >> -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) >> +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) >> { >> int r; >> struct kvm_vcpu *vcpu; >> struct page *page; >> >> + BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX); > > This should be UINT_MAX, no? No, I chose INT_MAX very intentional, as the underlying type of 'vcpu_id' is actually an int. Trying to store a value that's greater than INT_MAX (but smaller than UINT_MAX) will make it negative and that's definitely something we need to prevent as otherwise array indexed accesses like in the IOAPIC code would try to access memory before the allocated storage. Not good! > Regardless, the "need" for an explicit BUILD_BUG_ON() > is another reason I dislike relying on the KVM_MAX_VCPU_IDS check to detect > truncation. There's no "need" for the BUILD_BUG_ON(). It's just a cheap (compile time only, no runtime "overhead") assert that the code won't allow truncated values which may lead to follow-up bugs because of unintended truncation. And, after all, you suggested something like that (a truncation check) yourself. I just tried to provide it as something that doesn't need the odd '__id' argument and an explicit truncation check which would do the wrong thing if we would like to push KVM_MAX_VCPU_IDS above UINT_MAX (failing only at runtime, not at compile time). > If @id is checked as a 32-bit value, and we somehow screw up and > define KVM_MAX_VCPU_IDS to be a 64-bit value, clang will rightly complain that > the check is useless, e.g. given "#define KVM_MAX_VCPU_ID_TEST BIT(32)" > > arch/x86/kvm/x86.c:12171:9: error: result of comparison of constant 4294967296 with > expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] > if (id > KVM_MAX_VCPU_ID_TEST) > ~~ ^ ~~~~~~~~~~~~~~~~~~~~ > 1 error generated. ^^^^^^^^^^^^^^^^^^ Perfect! So this breaks the build. How much better can we prevent this bug from going unnoticed? Same for the BUILD_BUG_ON(). I don't see how it is somehow hiding things or making future changes silently fail. They won't. Neither will the x86 specific code compile, nor will the BUILD_BUG_ON() in kvm_vm_ioctl_create_vcpu(). So what you're trying to say?... > > >> if (id >= KVM_MAX_VCPU_IDS) >> return -EINVAL; > > What if we do an explicit check before calling kvm_vm_ioctl_create_vcpu()? That > would avoid the weird __id param, and provide a convenient location to document > exactly why KVM checks for truncation. My version has no "weird __id param" ;) And, honestly, the current u32 argument type makes no sense either. It's neither the final type used for 'vcpu_id' nor does it have to be explicitly 32 bits. So, IMHO, the argument type should be fixed in any case and using the type that preserves the UABI value just seems like a natural fit. kvm_vm_ioctl() is "only" the command multiplexer, deferring concrete implementation to the individual functions. It currently does no input value sanity checks itself, beside some user copy error checks. So why should KVM_CREATE_VCPU be special and have its truncation check be outside of the handler function? It just scatters the code and hurts readability, IMHO. > > We could also move the "if (id >= KVM_MAX_VCPU_IDS)" check to kvm_vm_ioctl(), > but I don't love that, because again IMO it makes the code less readable overall, > loses clang's tuautological constant check, and the cost of the extra check against > UINT_MAX is completely negligible. > > Though if I had to choose, I'd prefer moving the check to kvm_vm_ioctl() over > taking an "unsigned long" in kvm_vm_ioctl_create_vcpu(). Looks like we disagree again. I'd always choose the ulong argument over scattering the checks over multiple functions and possibly missing it if a new call site appears. > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 4965196cad58..8155146b16cd 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -5083,6 +5083,13 @@ static long kvm_vm_ioctl(struct file *filp, > return -EIO; > switch (ioctl) { > case KVM_CREATE_VCPU: > + /* > + * KVM tracks vCPU ID as a 32-bit value, be kind to userspace > + * and reject too-large values instead of silently truncating. > + */ > + if (arg > UINT_MAX) > + return -EINVAL; Unfortunately, that check is a tautology for 32 bit systems and wrong for reasons I explained in the beginning of my Email. But we can move the comment to kvm_vm_ioctl_create_vcpu() and explain the rational over there (with taking an 'unsigned long id' argument): diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 14841acb8b95..b04e87f6568f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4200,12 +4200,20 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) /* * Creates some virtual cpus. Good luck creating more than one. */ -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) { int r; struct kvm_vcpu *vcpu; struct page *page; + /* + * KVM tracks vCPU IDs as 'int', be kind to userspace and reject + * too-large values instead of silently truncating. + * + * Also ensure we're not breaking this assumption by accidentally + * pushing KVM_MAX_VCPU_IDS above INT_MAX. + */ + BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX); if (id >= KVM_MAX_VCPU_IDS) return -EINVAL; > + > r = kvm_vm_ioctl_create_vcpu(kvm, arg); > break; > case KVM_ENABLE_CAP: { Cheers, Mathias ¹ IMHO, using 'int' for vcpu_id is actually *very* *wrong*, as it's used as an index in certain constructs and having a signed type doesn't feel right at all. But that's just a side matter, as, according to the checks on the ioctl() path, the actual value of vcpu_id can never be negative. So lets not distract.
On 06.06.24 20:11, Mathias Krause wrote: > + /* > + * KVM tracks vCPU IDs as 'int', be kind to userspace and reject > + * too-large values instead of silently truncating. > + * > + * Also ensure we're not breaking this assumption by accidentally > + * pushing KVM_MAX_VCPU_IDS above INT_MAX. > + */ > + BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX); And yes, I'd rather like to write the above as something like below to account for type changes: BUILD_BUG_ON(KVM_MAX_VCPU_IDS > typeof(vcpu->vcpu_id)::max()); But, unfortunately, C is not there yet. One could try to implement something based on _Generic() but, meh.. > if (id >= KVM_MAX_VCPU_IDS) > return -EINVAL; >
On Thu, Jun 06, 2024, Mathias Krause wrote: > On 06.06.24 16:55, Sean Christopherson wrote: > > On Thu, Jun 06, 2024, Mathias Krause wrote: > The first part is _completely_ handled by the 'id >= KVM_MAX_VCPU_IDS' > test, as 'id' is still the "raw" value userland provided. I'm not arguing it doesn't, all I'm saying is that I don't like overloading a check against an arbitrary limit to also protect against truncation issues. > > E.g. x86 has another potentially more restrictive check on @id, and it looks > > quite odd to check @id against KVM_MAX_VCPU_IDS as an "unsigned long" in flow > > flow, but as an "unsigned int" in another. > > Again, that's two distinct things, even if looking similar. The first > check against KVM_MAX_VCPU_IDS does actually two things: > > 1/ Ensure the full user ABI provided value (ulong) is sane and > 2/ ensure it's within the hard limits KVM expects (fits unsigned int). > > Now we do both with only a single compare and maybe that's what's so > hard to grasp -- that a single check can do both things. But why not > make use of simple things when we can do so? Because it's unnecessarily clever. I've dealt with waaaay too much legacy KVM code that probably seemed super obvious at the time, but 8+ years and lots of code churn later was completely nonsensical and all but impossible to decipher. That's less likely to happen these days, as we have better tracking via lore, and I like to think we have better changelogs. But I still dislike doing unnecessarily clever things because it tends to set the next generation up to fail. > >> --- a/virt/kvm/kvm_main.c > >> +++ b/virt/kvm/kvm_main.c > >> @@ -4200,12 +4200,13 @@ static void kvm_create_vcpu_debugfs(struct > >> kvm_vcpu *vcpu) > >> /* > >> * Creates some virtual cpus. Good luck creating more than one. > >> */ > >> -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) > >> +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) > >> { > >> int r; > >> struct kvm_vcpu *vcpu; > >> struct page *page; > >> > >> + BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX); > > > > This should be UINT_MAX, no? > > No, I chose INT_MAX very intentional, as the underlying type of > 'vcpu_id' is actually an int. Oof, I didn't realize (or more likely, simply forgot) that vcpu_id and vcpu_idx are tracked as "int". > There's no "need" for the BUILD_BUG_ON(). It's just a cheap (compile > time only, no runtime "overhead") assert that the code won't allow > truncated values which may lead to follow-up bugs because of unintended > truncation. And, after all, you suggested something like that (a > truncation check) yourself. I just tried to provide it as something that > doesn't need the odd '__id' argument and an explicit truncation check > which would do the wrong thing if we would like to push KVM_MAX_VCPU_IDS > above UINT_MAX (failing only at runtime, not at compile time). Yeah, I know all that. I'm not arguing the actual cost of the code is at all meaningful. I'm purely concerned about the long-term maintenance cost. This is obviously a small thing that is unlikely to ever cause problems, but again, I suspect that past KVM developers said exactly that about things that have pegged my WTF-o-meter. > > If @id is checked as a 32-bit value, and we somehow screw up and > > define KVM_MAX_VCPU_IDS to be a 64-bit value, clang will rightly complain that > > the check is useless, e.g. given "#define KVM_MAX_VCPU_ID_TEST BIT(32)" > > > > arch/x86/kvm/x86.c:12171:9: error: result of comparison of constant 4294967296 with > > expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] > > if (id > KVM_MAX_VCPU_ID_TEST) > > ~~ ^ ~~~~~~~~~~~~~~~~~~~~ > > 1 error generated. > ^^^^^^^^^^^^^^^^^^ > Perfect! So this breaks the build. How much better can we prevent this > bug from going unnoticed? Yes, but iff @id is a 32-bit value, i.e. this trick doesn't work on 64-bit kernels if the comparison is done with @id is an unsigned long (and I'm hoping that we can kill off 32-bit KVM support in the not too distant future). > ¹ IMHO, using 'int' for vcpu_id is actually *very* *wrong*, as it's used > as an index in certain constructs and having a signed type doesn't feel > right at all. But that's just a side matter, as, according to the checks > on the ioctl() path, the actual value of vcpu_id can never be negative. > So lets not distract. Hmm, I 100% agree that it's horrific, but I disagree that it's a distraction. I think we should fix that at the same time as we harden the trunction stuff, so that it's (hopefully) clear what KVM _intends_ to support, as opposed to what the code happens to allow. In the end, I'm ok relying on the KVM_MAX_VCPU_IDS check, so long as there's a BUILD_BUG_ON() and a comment.
On 07.06.24 02:12, Sean Christopherson wrote: > On Thu, Jun 06, 2024, Mathias Krause wrote: >> On 06.06.24 16:55, Sean Christopherson wrote: >>> On Thu, Jun 06, 2024, Mathias Krause wrote: >>>> [snip] >>> If @id is checked as a 32-bit value, and we somehow screw up and >>> define KVM_MAX_VCPU_IDS to be a 64-bit value, clang will rightly complain that >>> the check is useless, e.g. given "#define KVM_MAX_VCPU_ID_TEST BIT(32)" >>> >>> arch/x86/kvm/x86.c:12171:9: error: result of comparison of constant 4294967296 with >>> expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] >>> if (id > KVM_MAX_VCPU_ID_TEST) >>> ~~ ^ ~~~~~~~~~~~~~~~~~~~~ >>> 1 error generated. >> ^^^^^^^^^^^^^^^^^^ >> Perfect! So this breaks the build. How much better can we prevent this >> bug from going unnoticed? > > Yes, but iff @id is a 32-bit value, i.e. this trick doesn't work on 64-bit kernels > if the comparison is done with @id is an unsigned long (and I'm hoping that we > can kill off 32-bit KVM support in the not too distant future). Fortunately, the BUILD_BUG_ON(KVM_MAX_VCPU_IDS > INT_MAX) will still catch it. > >> ¹ IMHO, using 'int' for vcpu_id is actually *very* *wrong*, as it's used >> as an index in certain constructs and having a signed type doesn't feel >> right at all. But that's just a side matter, as, according to the checks >> on the ioctl() path, the actual value of vcpu_id can never be negative. >> So lets not distract. > > Hmm, I 100% agree that it's horrific, but I disagree that it's a distraction. > I think we should fix that at the same time as we harden the trunction stuff, so > that it's (hopefully) clear what KVM _intends_ to support, as opposed to what the > code happens to allow. I looked into it a little closer and it's even a bigger mess than what I initially thought. 'vcpu_id' values get passed around as int, unsigned int, u32, u16 even (S390) and unsigned long in the various architectures implementing KVM support. A change touching all of these clearly needs quite some coordination among multiple maintainers and testing to rule out issues caused by changing the sign of the underlying type -- even if it'll be just new warnings popping up. I don't even have cross-compilers for all of these, less so setups to actually do some tests. So, unfortunately, I'm not up to do that change. > > In the end, I'm ok relying on the KVM_MAX_VCPU_IDS check, so long as there's > a BUILD_BUG_ON() and a comment. Ok, will send a v2 series soon, covering KVM_SET_BOOT_CPU_ID, too. Thanks, Mathias
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 14841acb8b95..9f18fc42f018 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4200,7 +4200,7 @@ static void kvm_create_vcpu_debugfs(struct kvm_vcpu *vcpu) /* * Creates some virtual cpus. Good luck creating more than one. */ -static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id) +static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, unsigned long id) { int r; struct kvm_vcpu *vcpu;
If, on a 64 bit system, a vCPU ID is provided that has the upper 32 bits set to a non-zero value, it may get accepted if the truncated to 32 bits integer value is below KVM_MAX_VCPU_IDS and 'max_vcpus'. This feels very wrong and triggered the reporting logic of PaX's SIZE_OVERFLOW plugin. Instead of silently truncating and accepting such values, pass the full value to kvm_vm_ioctl_create_vcpu() and make the existing limit checks return an error. Even if this is a userland ABI breaking change, no sane userland could have ever relied on that behaviour. Reported-by: PaX's SIZE_OVERFLOW plugin running on grsecurity's syzkaller Fixes: 6aa8b732ca01 ("[PATCH] kvm: userspace interface") Cc: Emese Revfy <re.emese@gmail.com> Cc: PaX Team <pageexec@freemail.hu> Signed-off-by: Mathias Krause <minipli@grsecurity.net> --- virt/kvm/kvm_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)