Message ID | c1535eba0bba6fc1b91f975f434af0929d9d7c96.1671298923.git.git@neowutran.ovh (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v1] Bug fix - Integer overflow when cpu frequency > u32 max value. | expand |
On Sat, Dec 17, 2022 at 06:42:05PM +0100, Neowutran wrote: > xen/arch/x86/time.c: Bug fix - Integer overflow when cpu frequency > u32 max value. > > What is was trying to do: I was trying to install QubesOS on my new computer > (AMD zen4 processor). Guest VM were unusably slow / unusable. > > What is the issue: The cpu frequency reported is wrong for linux guest in HVM > and PVH mode, and it cause issue with the TSC clocksource (for example). > > Why this patch solved my issue: > The root cause it that "d->arch.tsc_khz" is a unsigned integer storing > the cpu frequency in khz. It get multiplied by 1000, so if the cpu frequency > is over ~4,294 Mhz (u32 max value), then it overflow. > I am solving the issue by adding an explicit cast to u64 to avoid the overflow. https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches You're not matching the instructions. You need to add a "Signed-off-by" to indicate *you* wrote this, own the Copyright and are providing this under the license of the existing files. Your title line is on the long side. The discussion is useful for the message on the xen-devel mailing list, but isn't important for source code history. You should put "---" after the commit message and discussion below. I would suggest as a commit message: --------8<-----------------------------------------------------8<-------- xen/x86: prevent overflow with high frequency TSCs Promote tsc_khz to a 64-bit type before multiplying by 1000. Otherwise just above 4.294GHz the value will overflow. Processors with clocks this high are now in production and require this to work correctly. --------8<-----------------------------------------------------8<-------- Feel free to disagree, though any maintainer is going to want adjustments to your original commit message. Note to Xen and Linux distribution maintainers: I suggest this needs a point release of Xen. A large processor manufacturer has recently released such a processor. A great number of people are going to be rather unhappy in short order without this fix.
On 17/12/2022 5:42 pm, Neowutran wrote: > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c > index b01acd390d..7c77ec8902 100644 > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d, > case TSC_MODE_ALWAYS_EMULATE: > d->arch.vtsc_offset = get_s_time() - elapsed_nsec; > d->arch.tsc_khz = gtsc_khz ?: cpu_khz; > - set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); > + set_time_scale(&d->arch.vtsc_to_ns, (u64)d->arch.tsc_khz * 1000); Ah - I see you tracked down your bug in the end. One minor thing, we prefer to use (uint64_t) rather than (u64). As Elliot said, you need to submit a v2 with a Signed-off-by for the DCO before the patch can be accepted. A more succinct commit message would also be great - it wants to describe what the bug is, and what the fix is. For this, the key term is "overflow before widen" (or variation thereof) which is a common class of bug in C. Thanks, ~Andrew
On 18.12.2022 01:18, Andrew Cooper wrote: > On 17/12/2022 5:42 pm, Neowutran wrote: >> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c >> index b01acd390d..7c77ec8902 100644 >> --- a/xen/arch/x86/time.c >> +++ b/xen/arch/x86/time.c >> @@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d, >> case TSC_MODE_ALWAYS_EMULATE: >> d->arch.vtsc_offset = get_s_time() - elapsed_nsec; >> d->arch.tsc_khz = gtsc_khz ?: cpu_khz; >> - set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); >> + set_time_scale(&d->arch.vtsc_to_ns, (u64)d->arch.tsc_khz * 1000); > > Ah - I see you tracked down your bug in the end. One minor thing, we > prefer to use (uint64_t) rather than (u64). And yet better we like cast-less code, e.g. in this case by simply using 1000UL. Jan
On 17.12.2022 21:17, Elliott Mitchell wrote: > Note to Xen and Linux distribution maintainers: I suggest this needs a > point release of Xen. A large processor manufacturer has recently > released such a processor. A great number of people are going to be > rather unhappy in short order without this fix. As long as we aren't cutting point releases on every XSA (batch), I have a hard time seeing why we'd make any in this case. The change is going to be backported, sure. Jan
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c index b01acd390d..7c77ec8902 100644 --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -2585,7 +2585,7 @@ int tsc_set_info(struct domain *d, case TSC_MODE_ALWAYS_EMULATE: d->arch.vtsc_offset = get_s_time() - elapsed_nsec; d->arch.tsc_khz = gtsc_khz ?: cpu_khz; - set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000); + set_time_scale(&d->arch.vtsc_to_ns, (u64)d->arch.tsc_khz * 1000); /* * In default mode use native TSC if the host has safe TSC and