[RFC,0/6] vDSO support for Hyper-V guest on ARM64

Message ID 20191216001922.23008-1-boqun.feng@gmail.com (mailing list archive)

Message

Boqun Feng Dec. 16, 2019, 12:19 a.m. UTC
Hi,

This is the RFC patchset for vDSO support in ARM64 Hyper-V guest. To
test it, Michael's ARM64 support patchset:

	https://lore.kernel.org/linux-arm-kernel/1570129355-16005-1-git-send-email-mikelley@microsoft.com/

is needed.

Similar to x86, Hyper-V on ARM64 uses a TSC page for guests to read
the virtualized hardware timer. This TSC page is read-only for the
guests, so it can be used as a vDSO data page. The vDSO (userspace)
code can then use the same timer-reading code as the kernel, since
both read the same TSC page.
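
The read side of such a TSC page can be sketched roughly as follows.
This is a minimal userspace illustration under stated assumptions, not
the code from this series: the struct layout and the names
tsc_page_sketch, read_tsc_page_sketch and TSC_PAGE_INVALID are made up
for the example, a plain variable stands in for the hardware counter,
and the code relies on the GCC/Clang __atomic builtins and
unsigned __int128.

```c
#include <stdint.h>

/* Illustrative layout only; field names are assumptions for this sketch. */
struct tsc_page_sketch {
	volatile uint32_t tsc_sequence;	/* 0: page contents are not valid */
	uint32_t reserved;
	volatile uint64_t tsc_scale;	/* multiplier with 64 fractional bits */
	volatile int64_t tsc_offset;	/* added after scaling */
};

/* Sentinel telling the caller to fall back to a slower path. */
#define TSC_PAGE_INVALID UINT64_MAX

/*
 * Read the scaled time. raw_counter stands in for the hardware counter
 * (cntvct on ARM64); a real implementation would read the register.
 */
static uint64_t read_tsc_page_sketch(const struct tsc_page_sketch *pg,
				     const volatile uint64_t *raw_counter)
{
	uint32_t seq;
	uint64_t scale, raw;
	int64_t offset;

	do {
		seq = __atomic_load_n(&pg->tsc_sequence, __ATOMIC_ACQUIRE);
		if (seq == 0)
			return TSC_PAGE_INVALID;

		scale = pg->tsc_scale;
		offset = pg->tsc_offset;
		raw = *raw_counter;

		/* Order the data reads before re-checking the sequence. */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
	} while (__atomic_load_n(&pg->tsc_sequence, __ATOMIC_RELAXED) != seq);

	/* (raw * scale) >> 64, computed in 128 bits to avoid overflow. */
	return (uint64_t)(((unsigned __int128)raw * scale) >> 64) +
	       (uint64_t)offset;
}
```

Because the reader only loads from the page and retries when the
sequence number changes underneath it, the same routine can run in the
kernel and in the vDSO as long as both can see the page and the counter.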

This patchset therefore extends ARM64's __vdso_init() to allow multiple
data pages and introduces a vclock_mode concept, similar to x86, to
allow different platforms (bare-metal, Hyper-V, etc.) to switch to
different __arch_get_hw_counter() implementations. The rest of the
patchset does the necessary setup for Hyper-V guests (mapping the TSC
page, enabling userspace to read cntvct, etc.) to enable the vDSO.
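
The vclock_mode idea amounts to a per-clocksource switch in the counter
read path. The sketch below is only an illustration of that dispatch:
the enum values, the hw_counter_inputs_sketch struct and the function
name are hypothetical and do not come from the patches, and the inputs
are passed in explicitly so the example stays self-contained.

```c
#include <stdint.h>

/* Hypothetical clock modes; the names are illustrative only. */
enum vclock_mode_sketch {
	VCLOCK_SKETCH_NONE = 0,		/* no fast path: use the syscall */
	VCLOCK_SKETCH_ARCHTIMER,	/* use the architected counter as is */
	VCLOCK_SKETCH_HVCLOCK,		/* scale the counter via the TSC page */
};

/* Values a real implementation would read from hardware and data pages. */
struct hw_counter_inputs_sketch {
	uint64_t raw_counter;	/* e.g. cntvct */
	uint64_t tsc_scale;	/* from the Hyper-V TSC page */
	int64_t tsc_offset;	/* from the Hyper-V TSC page */
};

/* Sentinel meaning "fall back to the system call". */
#define HW_COUNTER_INVALID UINT64_MAX

static uint64_t
arch_get_hw_counter_sketch(enum vclock_mode_sketch mode,
			   const struct hw_counter_inputs_sketch *in)
{
	switch (mode) {
	case VCLOCK_SKETCH_ARCHTIMER:
		return in->raw_counter;
	case VCLOCK_SKETCH_HVCLOCK:
		/* Same scaling as the TSC page read: (raw * scale) >> 64. */
		return (uint64_t)(((unsigned __int128)in->raw_counter *
				   in->tsc_scale) >> 64) +
		       (uint64_t)in->tsc_offset;
	default:
		return HW_COUNTER_INVALID;
	}
}
```

Keeping the mode next to the rest of the vDSO data lets the kernel pick
the implementation when the clocksource is registered, without userspace
needing to know which platform it is running on.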

This patchset consists of 6 patches:

patch #1 allows the hv_get_raw_timer() definition to be overridden so
that userspace and kernel can share the same hv_read_tsc_page() definition.

patch #2 extends ARM64 to support multiple vDSO data pages.

patch #3 introduces vclock_mode similar to x86 to allow different
__arch_get_hw_counter() implementations for different clocksources.

patch #4 maps the Hyper-V TSC page into a vDSO data page.

patch #5 allows userspace to read cntvct, so that userspace can
efficiently read the clocksource.

patch #6 enables the vDSO for ARM64 Hyper-V guest.

The whole patchset is based on v5.5-rc1 plus Michael's ARM64 support
patchset, and I've done a few tests with:

	https://github.com/nlynch-mentor/vdsotest

Comments and suggestions are welcome!

Regards,
Boqun

Comments

Vincenzo Frascino Jan. 23, 2020, 10:48 a.m. UTC | #1
Hi Boqun Feng,

sorry for the late reply.

On 16/12/2019 00:19, Boqun Feng wrote:
> Hi,
> 
> This is the RFC patchset for vDSO support in ARM64 Hyper-V guest. To
> test it, Michael's ARM64 support patchset:
> 
> 	https://lore.kernel.org/linux-arm-kernel/1570129355-16005-1-git-send-email-mikelley@microsoft.com/
> 
> is needed.
> 
> Similar to x86, Hyper-V on ARM64 uses a TSC page for guests to read
> the virtualized hardware timer. This TSC page is read-only for the
> guests, so it can be used as a vDSO data page. The vDSO (userspace)
> code can then use the same timer-reading code as the kernel, since
> both read the same TSC page.
> 

I had a look at your patches and overall, I could not understand why we can't
use the arch_timer to do the same things you are doing with the one you
introduced in this series. What confuses me is that KVM works just fine with the
arch_timer which was designed with virtualization in mind. Why do we need
another one? Could you please explain?

> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
Boqun Feng Jan. 24, 2020, 6:32 a.m. UTC | #2
Hi Vincenzo,

On Thu, Jan 23, 2020 at 10:48:07AM +0000, Vincenzo Frascino wrote:
> Hi Boqun Feng,
> 
> sorry for the late reply.
> 

That's OK, thanks for your review ;-)

> On 16/12/2019 00:19, Boqun Feng wrote:
> > Hi,
> > 
> > This is the RFC patchset for vDSO support in ARM64 Hyper-V guest. To
> > test it, Michael's ARM64 support patchset:
> > 
> > 	https://lore.kernel.org/linux-arm-kernel/1570129355-16005-1-git-send-email-mikelley@microsoft.com/
> > 
> > is needed.
> > 
> > Similar to x86, Hyper-V on ARM64 uses a TSC page for guests to read
> > the virtualized hardware timer. This TSC page is read-only for the
> > guests, so it can be used as a vDSO data page. The vDSO (userspace)
> > code can then use the same timer-reading code as the kernel, since
> > both read the same TSC page.
> > 
> 
> I had a look at your patches and overall, I could not understand why we can't
> use the arch_timer to do the same things you are doing with the one you
> introduced in this series. What confuses me is that KVM works just fine with the
> arch_timer which was designed with virtualization in mind. Why do we need
> another one? Could you please explain?
> 

Please note that the guest VM on Hyper-V for ARM64 doesn't use
arch_timer as the clocksource. See:

	https://lore.kernel.org/linux-arm-kernel/1570129355-16005-7-git-send-email-mikelley@microsoft.com/

There, ACPI_SIG_GTDT is used for setting up the Hyper-V synthetic clocksource
and other initialization work.

So just to be clear, your suggestion is

1) Hyper-V guest on ARM64 should use arch_timer as clocksource and vDSO
will just work.

or

2) Even though arch_timer is not used as the clocksource, we can still
use it for vDSO.

?

Regards,
Boqun

> -- 
> Regards,
> Vincenzo
Vincenzo Frascino Jan. 24, 2020, 10:24 a.m. UTC | #3
Hi Boqun Feng,

On 24/01/2020 06:32, Boqun Feng wrote:
> Hi Vincenzo,
> 

[...]

>>
>> I had a look at your patches and overall, I could not understand why we can't
>> use the arch_timer to do the same things you are doing with the one you
>> introduced in this series. What confuses me is that KVM works just fine with the
>> arch_timer which was designed with virtualization in mind. Why do we need
>> another one? Could you please explain?
>>
> 
> Please note that the guest VM on Hyper-V for ARM64 doesn't use
> arch_timer as the clocksource. See:
> 
> 	https://lore.kernel.org/linux-arm-kernel/1570129355-16005-7-git-send-email-mikelley@microsoft.com/
> 
> There, ACPI_SIG_GTDT is used for setting up the Hyper-V synthetic clocksource
> and other initialization work.
>

I had a look at it and my question stands: why do we need another timer
on arm64?

> So just to be clear, your suggestion is
> 
> 1) Hyper-V guest on ARM64 should use arch_timer as clocksource and vDSO
> will just work.
> 
> or
> 
> 2) Even though arch_timer is not used as the clocksource, we can still
> use it for vDSO.
> 
> ?
> 

Option #1 would be the preferred solution, unless there is a good reason against it.

> Regards,
> Boqun
>
Boqun Feng Jan. 28, 2020, 5:58 a.m. UTC | #4
On Fri, Jan 24, 2020 at 10:24:44AM +0000, Vincenzo Frascino wrote:
> Hi Boqun Feng,
> 
> On 24/01/2020 06:32, Boqun Feng wrote:
> > Hi Vincenzo,
> > 
> 
> [...]
> 
> >>
> >> I had a look at your patches and overall, I could not understand why we can't
> >> use the arch_timer to do the same things you are doing with the one you
> >> introduced in this series. What confuses me is that KVM works just fine with the
> >> arch_timer which was designed with virtualization in mind. Why do we need
> >> another one? Could you please explain?
> >>
> > 
> > Please note that the guest VM on Hyper-V for ARM64 doesn't use
> > arch_timer as the clocksource. See:
> > 
> > 	https://lore.kernel.org/linux-arm-kernel/1570129355-16005-7-git-send-email-mikelley@microsoft.com/
> > 
> > There, ACPI_SIG_GTDT is used for setting up the Hyper-V synthetic clocksource
> > and other initialization work.
> >
> 
> I had a look at it and my question stands: why do we need another timer
> on arm64?
> 

Sorry for the late response. It's weekend and Chinese New Year, so I got
to spend some time making (and mostly eating) dumplings ;-)

After discussion with Michael, here is some explanation why we need
another timer:

The synthetic clocks that Hyper-V presents in a guest VM were originally
created for the x86 architecture. They provide a level of abstraction
that solves problems like continuity across live migrations where the
hardware clock (i.e., TSC in the case of x86) frequency may be different
across the migration. When Hyper-V was brought to ARM64, this
abstraction was maintained to provide consistency across the x86 and
ARM64 architectures, and for both Windows and Linux guest VMs. The
core Linux code for the Hyper-V clocks (in
drivers/clocksource/hyperv_timer.c) is architecture neutral and works on
both x86 and ARM64. As you can see, this part is done in Michael's
patchset.
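
To make the migration point concrete, here is a small worked sketch. It
assumes the documented Hyper-V reference time definition, i.e. a value
in 100 ns units computed as ((counter * scale) >> 64) + offset. The
rebase step is purely an illustration of how a scale/offset pair can
hide a frequency change, not a description of what Hyper-V actually
does during migration, and all names are hypothetical. It needs
unsigned __int128 (GCC/Clang) and a counter frequency above 10 MHz so
that the scale fits in 64 bits.

```c
#include <stdint.h>

#define REF_TICKS_PER_SEC 10000000ULL	/* reference time is in 100 ns units */

/* Hypothetical scale/offset pair as published to the guest. */
struct ref_clock_sketch {
	uint64_t scale;		/* multiplier with 64 fractional bits */
	int64_t offset;		/* in 100 ns units */
};

/* Scale such that (raw * scale) >> 64 counts 100 ns units. */
static uint64_t scale_for_hz_sketch(uint64_t counter_hz)
{
	/* Precondition: counter_hz > REF_TICKS_PER_SEC. */
	return (uint64_t)((((unsigned __int128)REF_TICKS_PER_SEC) << 64) /
			  counter_hz);
}

static uint64_t ref_time_sketch(const struct ref_clock_sketch *c, uint64_t raw)
{
	return (uint64_t)(((unsigned __int128)raw * c->scale) >> 64) +
	       (uint64_t)c->offset;
}

/*
 * Pick a new scale for the destination's counter frequency and an
 * offset such that the reference time continues from ref_now.
 */
static void rebase_after_migration_sketch(struct ref_clock_sketch *c,
					  uint64_t ref_now, uint64_t new_hz,
					  uint64_t new_raw)
{
	uint64_t scaled;

	c->scale = scale_for_hz_sketch(new_hz);
	scaled = (uint64_t)(((unsigned __int128)new_raw * c->scale) >> 64);
	c->offset = (int64_t)(ref_now - scaled);
}
```

With a pair like this, a guest that moves from a host whose counter
runs at 20 MHz to one running at 40 MHz keeps seeing reference time
advance at 10,000,000 ticks per second, with no jump at the switch.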

Arguably, Hyper-V for ARM64 should have optimized for consistency with
the ARM64 community rather than with the existing x86 implementation and
existing guest code in Windows. But at this point, it is what it is,
and the Hyper-V clocks do solve problems like migration that aren’t
addressed in ARM64 until v8.4 of the architecture with the addition of
the counter hardware scaling feature. Hyper-V doesn’t currently map the
ARM arch timer interrupts into guest VMs, so we need to use the existing
Hyper-V clocks and the common code that already exists.


Does the above answer your question?

Regards,
Boqun

Marc Zyngier Jan. 28, 2020, 11:48 a.m. UTC | #5
On 2020-01-28 05:58, Boqun Feng wrote:
> On Fri, Jan 24, 2020 at 10:24:44AM +0000, Vincenzo Frascino wrote:
>> Hi Boqun Feng,
>> 
>> On 24/01/2020 06:32, Boqun Feng wrote:
>> > Hi Vincenzo,
>> >
>> 
>> [...]
>> 
>> >>
>> >> I had a look at your patches and overall, I could not understand why we can't
>> >> use the arch_timer to do the same things you are doing with the one you
>> >> introduced in this series. What confuses me is that KVM works just fine with the
>> >> arch_timer which was designed with virtualization in mind. Why do we need
>> >> another one? Could you please explain?
>> >>
>> >
>> > Please note that the guest VM on Hyper-V for ARM64 doesn't use
>> > arch_timer as the clocksource. See:
>> >
>> > 	https://lore.kernel.org/linux-arm-kernel/1570129355-16005-7-git-send-email-mikelley@microsoft.com/
>> >
>> > There, ACPI_SIG_GTDT is used for setting up the Hyper-V synthetic clocksource
>> > and other initialization work.
>> >
>> 
>> I had a look at it and my question stands: why do we need another timer
>> on arm64?
>> 
> 
> Sorry for the late response. It's weekend and Chinese New Year, so I got
> to spend some time making (and mostly eating) dumplings ;-)

And you haven't been sharing! ;-)

> After discussion with Michael, here is some explanation why we need
> another timer:
> 
> The synthetic clocks that Hyper-V presents in a guest VM were originally
> created for the x86 architecture. They provide a level of abstraction
> that solves problems like continuity across live migrations where the
> hardware clock (i.e., TSC in the case of x86) frequency may be different
> across the migration. When Hyper-V was brought to ARM64, this
> abstraction was maintained to provide consistency across the x86 and
> ARM64 architectures, and for both Windows and Linux guest VMs. The
> core Linux code for the Hyper-V clocks (in
> drivers/clocksource/hyperv_timer.c) is architecture neutral and works on
> both x86 and ARM64. As you can see, this part is done in Michael's
> patchset.
> 
> Arguably, Hyper-V for ARM64 should have optimized for consistency with
> the ARM64 community rather than with the existing x86 implementation and
> existing guest code in Windows. But at this point, it is what it is,
> and the Hyper-V clocks do solve problems like migration that aren’t
> addressed in ARM64 until v8.4 of the architecture with the addition of
> the counter hardware scaling feature. Hyper-V doesn’t currently map the
> ARM arch timer interrupts into guest VMs, so we need to use the existing
> Hyper-V clocks and the common code that already exists.

The migration thing is a bit of a red herring. Do you really anticipate
VM migration across systems that have their timers running at different
frequencies *today*? And even if you did, there are ways to deal with it
with the arch timers (patches to that effect were posted on the list, and
there was even a bit of an ARM spec for it).

I find it odd to try and make arm64 "just another x86", while the
architecture gives you most of what you need already. I guess I'm tainted.

Thanks,

         M.