[RFC,V2] xen: interface: introduce pvclk interface

Message ID	1453278682-8750-1-git-send-email-van.freenix@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xen.org> From: Peng Fan <van.freenix@gmail.com> To: xen-devel@lists.xenproject.org Date: Wed, 20 Jan 2016 16:31:22 +0800 Message-Id: <1453278682-8750-1-git-send-email-van.freenix@gmail.com> Cc: Ian Campbell <ian.campbell@citrix.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, George Dunlap <george.dunlap@citrix.com>, Julien Grall <julien.grall@citrix.com>, David Vrabel <david.vrabel@citrix.com>, van.freenix@gmail.com Subject: [Xen-devel] [RFC V2] xen: interface: introduce pvclk interface Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org

Peng Fan Jan. 20, 2016, 8:31 a.m. UTC

Introduce pvclk interface which is useful when doing device passthrough
on ARM platform.

To ARM platform, saying embedded SoCs, clock management hardware IP
is controlled by Dom0, DomU does not have clocksource. So we need a
paravirtualized clock source to let DomU can handle device that are
passed through from Dom0.

Signed-off-by: Peng Fan <van.freenix@gmail.com>
Cc: George Dunlap <george.dunlap@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Julien Grall <julien.grall@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---

V2:
The V1 thread:
http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html
The V1 thread binds the interface part to the implementation for linux kernel.
This V2 only contains the pvclk interface part.

In this patch:
Backend,
/local/domain/<X>/backend/vclk/<Y>/nr-clks
/local/domain/<X>/backend/vclk/<Y>/<Z>/name
Y is frontend domid, Z is clk id, name is the clk name.

Frontend,
/local/domain/<X>/device/vclk/clk-ring-ref
/local/domain/<X>/device/vclk/event-channel
/local/domain/<X>/device/vclk/<Y>/name
Y is the clk id.

My original idea is to pass clk name which is parsed from dts.
And not expose clk name to xenstore. DomU pass the name to Dom0,
Dom0 use the name to find correct clk and operate on the clk.

Not sure whether I am right or not, please advise.
I have no knowledge of ACPI, just following Ian's
comment on V1, I add the clk id and expose the info to xenstore,
but discard the devid/bus. If devid/bus is needed, I think we
can encoded it into the clk id part in future.
Exposing the info to xenstore means we need to add entry in xl
configuration file like this: vclks =  ["id:clkname", "id:clkname"] -->
vclks = ["1:uart2-root-clk", "3:gpu-root-clk"]. And the libxl pvclk
should parse the vclks entry and fill the contents to xenstore backend nodes.
The frontend xenstore node will be created, by driver probe function 
when DomU booting or by libxl pvclk before DomU boots.
To my use case, Dom0 and DomU both use device tree, I need to build
the mapping table between id and name, since I use name to lookup
the clk in backend, like this:
"clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
is another different case.

 xen/include/public/io/clkif.h | 129 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 129 insertions(+)
 create mode 100644 xen/include/public/io/clkif.h

Jürgen Groß Jan. 20, 2016, 9:05 a.m. UTC | #1

On 20/01/16 09:31, Peng Fan wrote:
> Introduce pvclk interface which is useful when doing device passthrough
> on ARM platform.

...

> +/*
> + * Frontend request
> + *
> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
> + *	excluding XEN_CLK_END. id is the
> + * id: clk id
> + * rate: clock rate. Used for set rate.

Which unit? Hz?

> + */
> +struct xen_clkif_request {
> +	uint32_t cmd;
> +	uint32_t id;
> +	uint64_t rate;
> +};
> +typedef struct xen_clkif_request xen_clkif_request_t;
> +
> +/*
> + * Backend response
> + *
> + * cmd: command for operation on clk, same with the cmd in request.
> + * id: clk id, same with the id in request.
> + * success: indicate failure or success for the cmd.

Values?

> + * rate: clock rate. Used for get rate.
> + *
> + * cmd and id are filled by backend and passed to frontend to
> + * let frontend check whether the response is for the current
> + * request or not.

I'd rather let the frontend add a request Id to the request which
will be echoed here instead cmd and clock Id.


Juergen

Peng Fan Jan. 20, 2016, 9:25 a.m. UTC | #2

Hi Juergen,

On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>On 20/01/16 09:31, Peng Fan wrote:
>> Introduce pvclk interface which is useful when doing device passthrough
>> on ARM platform.
>
>...
>
>> +/*
>> + * Frontend request
>> + *
>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>> + *	excluding XEN_CLK_END. id is the
>> + * id: clk id
>> + * rate: clock rate. Used for set rate.
>
>Which unit? Hz?

Yeah. Hz. I'll add comments in V3.

>
>> + */
>> +struct xen_clkif_request {
>> +	uint32_t cmd;
>> +	uint32_t id;
>> +	uint64_t rate;
>> +};
>> +typedef struct xen_clkif_request xen_clkif_request_t;
>> +
>> +/*
>> + * Backend response
>> + *
>> + * cmd: command for operation on clk, same with the cmd in request.
>> + * id: clk id, same with the id in request.
>> + * success: indicate failure or success for the cmd.
>
>Values?

I'd like to let the frontend/backend driver to determine the value.
In my implementation for linux, if the backend API supports return value,
I'll fill the return value to success entry. If the backend API returns
void, I'll just fill 0 to success entry.

>
>> + * rate: clock rate. Used for get rate.
>> + *
>> + * cmd and id are filled by backend and passed to frontend to
>> + * let frontend check whether the response is for the current
>> + * request or not.
>
>I'd rather let the frontend add a request Id to the request which
>will be echoed here instead cmd and clock Id.

If using request id, I think we can encode cmd and id into request id?
To my dts case, clk id is simple. But I am not sure about ACPI part. Maybe
I can not simply encode them into request id.
Wait for more comments on this:)

Thanks,
Peng.
>
>
>Juergen

Jan Beulich Jan. 20, 2016, 10:14 a.m. UTC | #3

>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
> +/*
> + * Backend response
> + *
> + * cmd: command for operation on clk, same with the cmd in request.
> + * id: clk id, same with the id in request.
> + * success: indicate failure or success for the cmd.
> + * rate: clock rate. Used for get rate.
> + *
> + * cmd and id are filled by backend and passed to frontend to
> + * let frontend check whether the response is for the current
> + * request or not.
> + */
> +struct xen_clkif_response {
> +	uint32_t cmd;
> +	uint32_t id;
> +	uint32_t success;
> +	uint64_t rate;
> +};

This isn't 32-/64-bit clean. Question is whether echoing cmd is really
needed.

Also naming a field "success" is pretty odd - is this mean to be a
boolean? Better name it e.g. status, allowing for different (error)
indicators.

And what I'm missing throughout the file is some description of how
clock events (interrupts?) are actually meant to make it into the
guest.

Jan

Jan Beulich Jan. 20, 2016, 10:16 a.m. UTC | #4

>>> On 20.01.16 at 10:25, <van.freenix@gmail.com> wrote:
> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>>On 20/01/16 09:31, Peng Fan wrote:
>>> + */
>>> +struct xen_clkif_request {
>>> +	uint32_t cmd;
>>> +	uint32_t id;
>>> +	uint64_t rate;
>>> +};
>>> +typedef struct xen_clkif_request xen_clkif_request_t;
>>> +
>>> +/*
>>> + * Backend response
>>> + *
>>> + * cmd: command for operation on clk, same with the cmd in request.
>>> + * id: clk id, same with the id in request.
>>> + * success: indicate failure or success for the cmd.
>>
>>Values?
> 
> I'd like to let the frontend/backend driver to determine the value.

No, that would imply matched pairs of frontends and backends. Yet
the purpose of spelling out an interface is to allow interoperability.

Jan

Jürgen Groß Jan. 20, 2016, 10:40 a.m. UTC | #5

On 20/01/16 10:25, Peng Fan wrote:
> Hi Juergen,
> 
> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>> On 20/01/16 09:31, Peng Fan wrote:
>>> Introduce pvclk interface which is useful when doing device passthrough
>>> on ARM platform.
>>
>> ...
>>
>>> +/*
>>> + * Frontend request
>>> + *
>>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>>> + *	excluding XEN_CLK_END. id is the
>>> + * id: clk id
>>> + * rate: clock rate. Used for set rate.
>>
>> Which unit? Hz?
> 
> Yeah. Hz. I'll add comments in V3.
> 
>>
>>> + */
>>> +struct xen_clkif_request {
>>> +	uint32_t cmd;
>>> +	uint32_t id;
>>> +	uint64_t rate;
>>> +};
>>> +typedef struct xen_clkif_request xen_clkif_request_t;
>>> +
>>> +/*
>>> + * Backend response
>>> + *
>>> + * cmd: command for operation on clk, same with the cmd in request.
>>> + * id: clk id, same with the id in request.
>>> + * success: indicate failure or success for the cmd.
>>
>> Values?
> 
> I'd like to let the frontend/backend driver to determine the value.
> In my implementation for linux, if the backend API supports return value,
> I'll fill the return value to success entry. If the backend API returns
> void, I'll just fill 0 to success entry.

So please specify "0" to mean success and add some sensible error
values. There might be multiple frontend- and/or backend-variants which
all must agree using the same interface.

>>> + * rate: clock rate. Used for get rate.
>>> + *
>>> + * cmd and id are filled by backend and passed to frontend to
>>> + * let frontend check whether the response is for the current
>>> + * request or not.
>>
>> I'd rather let the frontend add a request Id to the request which
>> will be echoed here instead cmd and clock Id.
> 
> If using request id, I think we can encode cmd and id into request id?

This should just be an opaque value for the backend. The frontend is
free how to create this value, it should be unique for every outstanding
request of a domU, however. It could be built from cmd and id in case
there can't be multiple requests with the same cmd/id combination
active in a domU.


juergen

Peng Fan Jan. 20, 2016, 11:40 a.m. UTC | #6

Hi Jan,

On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
>> +/*
>> + * Backend response
>> + *
>> + * cmd: command for operation on clk, same with the cmd in request.
>> + * id: clk id, same with the id in request.
>> + * success: indicate failure or success for the cmd.
>> + * rate: clock rate. Used for get rate.
>> + *
>> + * cmd and id are filled by backend and passed to frontend to
>> + * let frontend check whether the response is for the current
>> + * request or not.
>> + */
>> +struct xen_clkif_response {
>> +	uint32_t cmd;
>> +	uint32_t id;
>> +	uint32_t success;
>> +	uint64_t rate;
>> +};
>
>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>needed.

As Juergen suggested, use a request id. I'll fix this in V3.
32-/64-bit unclean, I can not get you here (:

>
>Also naming a field "success" is pretty odd - is this mean to be a
>boolean? Better name it e.g. status, allowing for different (error)
>indicators.

As you suggested, how about `int status`? And in this clkif.h,
define different status value, such as `#define XEN_CLK_PREPARE_OK 0,
#define XEN_CLK_PREPARE_FAILURE -1`. The frontend and backend should
be aware of the status value.

>
>And what I'm missing throughout the file is some description of how
>clock events (interrupts?) are actually meant to make it into the
>guest.

I have a simple implementation v1 patch for linux, http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
You can see how it works.

Thanks,
Peng.

>
>Jan
>

Peng Fan Jan. 20, 2016, 11:48 a.m. UTC | #7

Hi Juergen,

On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote:
>On 20/01/16 10:25, Peng Fan wrote:
>> Hi Juergen,
>> 
>> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>>> On 20/01/16 09:31, Peng Fan wrote:
>>>> Introduce pvclk interface which is useful when doing device passthrough
>>>> on ARM platform.
>>>
>>> ...
>>>
>>>> +/*
>>>> + * Frontend request
>>>> + *
>>>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>>>> + *	excluding XEN_CLK_END. id is the
>>>> + * id: clk id
>>>> + * rate: clock rate. Used for set rate.
>>>
>>> Which unit? Hz?
>> 
>> Yeah. Hz. I'll add comments in V3.
>> 
>>>
>>>> + */
>>>> +struct xen_clkif_request {
>>>> +	uint32_t cmd;
>>>> +	uint32_t id;
>>>> +	uint64_t rate;
>>>> +};
>>>> +typedef struct xen_clkif_request xen_clkif_request_t;
>>>> +
>>>> +/*
>>>> + * Backend response
>>>> + *
>>>> + * cmd: command for operation on clk, same with the cmd in request.
>>>> + * id: clk id, same with the id in request.
>>>> + * success: indicate failure or success for the cmd.
>>>
>>> Values?
>> 
>> I'd like to let the frontend/backend driver to determine the value.
>> In my implementation for linux, if the backend API supports return value,
>> I'll fill the return value to success entry. If the backend API returns
>> void, I'll just fill 0 to success entry.
>
>So please specify "0" to mean success and add some sensible error
>values. There might be multiple frontend- and/or backend-variants which
>all must agree using the same interface.

Will change this to `int status`, as also observed by Jan.
I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define XEN_CLK_ENABLE_FAILURE -1"

>
>>>> + * rate: clock rate. Used for get rate.
>>>> + *
>>>> + * cmd and id are filled by backend and passed to frontend to
>>>> + * let frontend check whether the response is for the current
>>>> + * request or not.
>>>
>>> I'd rather let the frontend add a request Id to the request which
>>> will be echoed here instead cmd and clock Id.
>> 
>> If using request id, I think we can encode cmd and id into request id?
>
>This should just be an opaque value for the backend. The frontend is
>free how to create this value, it should be unique for every outstanding
>request of a domU, however. It could be built from cmd and id in case
>there can't be multiple requests with the same cmd/id combination
>active in a domU.

Thanks for teaching me on this. So the request id should be globally unique
for backend. So "random value(generated in frontend) + frontend domid" is ok for this?

Thanks,
Peng.
>
>
>juergen

Jan Beulich Jan. 20, 2016, 12:01 p.m. UTC | #8

>>> On 20.01.16 at 12:40, <van.freenix@gmail.com> wrote:
> Hi Jan,
> 
> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
>>> +/*
>>> + * Backend response
>>> + *
>>> + * cmd: command for operation on clk, same with the cmd in request.
>>> + * id: clk id, same with the id in request.
>>> + * success: indicate failure or success for the cmd.
>>> + * rate: clock rate. Used for get rate.
>>> + *
>>> + * cmd and id are filled by backend and passed to frontend to
>>> + * let frontend check whether the response is for the current
>>> + * request or not.
>>> + */
>>> +struct xen_clkif_response {
>>> +	uint32_t cmd;
>>> +	uint32_t id;
>>> +	uint32_t success;
>>> +	uint64_t rate;
>>> +};
>>
>>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>>needed.
> 
> As Juergen suggested, use a request id. I'll fix this in V3.
> 32-/64-bit unclean, I can not get you here (:

The layout of above structure may end up different for 32- and
64-bit guests, depending on the alignment of uint64_t.

>>And what I'm missing throughout the file is some description of how
>>clock events (interrupts?) are actually meant to make it into the
>>guest.
> 
> I have a simple implementation v1 patch for linux, 
> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
> You can see how it works.

No, sorry, that's not the point of the inquiry. It seems to me that
there are more aspects to this interface, not directly related to
the request/reply protocol written down here, which need to be
spelled out event if they don't require any particular definitions
or type declarations.

Jan

Stefano Stabellini Jan. 20, 2016, 12:06 p.m. UTC | #9

On Wed, 20 Jan 2016, Peng Fan wrote:
> To my use case, Dom0 and DomU both use device tree, I need to build
> the mapping table between id and name, since I use name to lookup
> the clk in backend, like this:
> "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
> is another different case.

Theoretically on systems using ACPI there is no need to fiddle with
clocks, see

Documentation/arm64/arm-acpi.txt

"In DT, clocks need to be specified
and the drivers need to take them into account.  In ACPI, the assumption
is that UEFI will leave the device in a reasonable default state, including
any clock settings.  If for some reason the driver needs to change a clock
value, this can be done in an ACPI method; all the driver needs to do is
invoke the method and not concern itself with what the method needs to do
to change the clock.  Changing the hardware can then take place over time
by changing what the ACPI method does, and not the driver.

In DT, the parameters needed by the driver to set up clocks as in the example
above are known as "bindings"; in ACPI, these are known as "Device Properties"
and provided to a driver via the _DSD object."

However currently we don't have the ability to run ACPI in DomU guests
on ARM. Even if we had, there is no way to call native ACPI methods from
any guests other than Dom0, even on x86. We just have to hope that
clocks don't need to be reset on ACPI systems.

Jürgen Groß Jan. 20, 2016, 12:11 p.m. UTC | #10

On 20/01/16 12:48, Peng Fan wrote:
> Hi Juergen,
> 
> On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote:
>> On 20/01/16 10:25, Peng Fan wrote:
>>> Hi Juergen,
>>>
>>> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>>>> On 20/01/16 09:31, Peng Fan wrote:
>>>>> Introduce pvclk interface which is useful when doing device passthrough
>>>>> on ARM platform.
>>>>
>>>> ...
>>>>
>>>>> +/*
>>>>> + * Frontend request
>>>>> + *
>>>>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>>>>> + *	excluding XEN_CLK_END. id is the
>>>>> + * id: clk id
>>>>> + * rate: clock rate. Used for set rate.
>>>>
>>>> Which unit? Hz?
>>>
>>> Yeah. Hz. I'll add comments in V3.
>>>
>>>>
>>>>> + */
>>>>> +struct xen_clkif_request {
>>>>> +	uint32_t cmd;
>>>>> +	uint32_t id;
>>>>> +	uint64_t rate;
>>>>> +};
>>>>> +typedef struct xen_clkif_request xen_clkif_request_t;
>>>>> +
>>>>> +/*
>>>>> + * Backend response
>>>>> + *
>>>>> + * cmd: command for operation on clk, same with the cmd in request.
>>>>> + * id: clk id, same with the id in request.
>>>>> + * success: indicate failure or success for the cmd.
>>>>
>>>> Values?
>>>
>>> I'd like to let the frontend/backend driver to determine the value.
>>> In my implementation for linux, if the backend API supports return value,
>>> I'll fill the return value to success entry. If the backend API returns
>>> void, I'll just fill 0 to success entry.
>>
>> So please specify "0" to mean success and add some sensible error
>> values. There might be multiple frontend- and/or backend-variants which
>> all must agree using the same interface.
> 
> Will change this to `int status`, as also observed by Jan.
> I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define XEN_CLK_ENABLE_FAILURE -1"
> 
>>
>>>>> + * rate: clock rate. Used for get rate.
>>>>> + *
>>>>> + * cmd and id are filled by backend and passed to frontend to
>>>>> + * let frontend check whether the response is for the current
>>>>> + * request or not.
>>>>
>>>> I'd rather let the frontend add a request Id to the request which
>>>> will be echoed here instead cmd and clock Id.
>>>
>>> If using request id, I think we can encode cmd and id into request id?
>>
>> This should just be an opaque value for the backend. The frontend is
>> free how to create this value, it should be unique for every outstanding
>> request of a domU, however. It could be built from cmd and id in case
>> there can't be multiple requests with the same cmd/id combination
>> active in a domU.
> 
> Thanks for teaching me on this. So the request id should be globally unique
> for backend. So "random value(generated in frontend) + frontend domid" is ok for this?

Being unique per frontend is enough, as each frontend will only ever see
responses for it's own requests. Make it as simple as possible. I guess
there will be a maximum of active requests possible, e.g. the number of
request slots in the request ring. You could use the index into the ring
as Id (pvSCSI frontend is doing it this way).

Juergen

Ian Campbell Jan. 20, 2016, 12:27 p.m. UTC | #11

On Wed, 2016-01-20 at 12:06 +0000, Stefano Stabellini wrote:
> On Wed, 20 Jan 2016, Peng Fan wrote:
> > To my use case, Dom0 and DomU both use device tree, I need to build
> > the mapping table between id and name, since I use name to lookup
> > the clk in backend, like this:
> > "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
> > is another different case.
> 
> Theoretically on systems using ACPI there is no need to fiddle with
> clocks,

I mentioned ACPI in my replies to v1 more as a placeholder for "other
firmware descriptions than DT", in order that the pv protocol we end up
with does not end up being DT specific, which would be a mistake
irrespective of what may or may not be required for non-DT firmware
descriptions.

Ian.

>  see
> 
> Documentation/arm64/arm-acpi.txt
> 
> 
> "In DT, clocks need to be specified
> and the drivers need to take them into account.  In ACPI, the assumption
> is that UEFI will leave the device in a reasonable default state,
> including
> any clock settings.  If for some reason the driver needs to change a
> clock
> value, this can be done in an ACPI method; all the driver needs to do is
> invoke the method and not concern itself with what the method needs to do
> to change the clock.  Changing the hardware can then take place over time
> by changing what the ACPI method does, and not the driver.
> 
> In DT, the parameters needed by the driver to set up clocks as in the
> example
> above are known as "bindings"; in ACPI, these are known as "Device
> Properties"
> and provided to a driver via the _DSD object."
> 
> 
> However currently we don't have the ability to run ACPI in DomU guests
> on ARM. Even if we had, there is no way to call native ACPI methods from
> any guests other than Dom0, even on x86. We just have to hope that
> clocks don't need to be reset on ACPI systems.

Peng Fan Jan. 20, 2016, 1:52 p.m. UTC | #12

Hi Ian, Stefano

On Wed, Jan 20, 2016 at 12:27:07PM +0000, Ian Campbell wrote:
>On Wed, 2016-01-20 at 12:06 +0000, Stefano Stabellini wrote:
>> On Wed, 20 Jan 2016, Peng Fan wrote:
>> > To my use case, Dom0 and DomU both use device tree, I need to build
>> > the mapping table between id and name, since I use name to lookup
>> > the clk in backend, like this:
>> > "clk = __clk_loopkup(clkname); clk_prepare_enable(clk)". Maybe ACPI
>> > is another different case.
>> 
>> Theoretically on systems using ACPI there is no need to fiddle with
>> clocks,
>
>I mentioned ACPI in my replies to v1 more as a placeholder for "other
>firmware descriptions than DT", in order that the pv protocol we end up
>with does not end up being DT specific, which would be a mistake
>irrespective of what may or may not be required for non-DT firmware
>descriptions.

Thanks for clarifying. Beside this, are you ok with the xenstore node description in this file?

Thanks,
Peng.

>
>Ian.
>
>>  see
>> 
>> Documentation/arm64/arm-acpi.txt
>> 
>> 
>> "In DT, clocks need to be specified
>> and the drivers need to take them into account.  In ACPI, the assumption
>> is that UEFI will leave the device in a reasonable default state,
>> including
>> any clock settings.  If for some reason the driver needs to change a
>> clock
>> value, this can be done in an ACPI method; all the driver needs to do is
>> invoke the method and not concern itself with what the method needs to do
>> to change the clock.  Changing the hardware can then take place over time
>> by changing what the ACPI method does, and not the driver.
>> 
>> In DT, the parameters needed by the driver to set up clocks as in the
>> example
>> above are known as "bindings"; in ACPI, these are known as "Device
>> Properties"
>> and provided to a driver via the _DSD object."
>> 
>> 
>> However currently we don't have the ability to run ACPI in DomU guests
>> on ARM. Even if we had, there is no way to call native ACPI methods from
>> any guests other than Dom0, even on x86. We just have to hope that
>> clocks don't need to be reset on ACPI systems.

Peng Fan Jan. 20, 2016, 2:05 p.m. UTC | #13

Hi Jan,
On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
>>>> On 20.01.16 at 12:40, <van.freenix@gmail.com> wrote:
>> Hi Jan,
>> 
>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>>>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
>>>> +/*
>>>> + * Backend response
>>>> + *
>>>> + * cmd: command for operation on clk, same with the cmd in request.
>>>> + * id: clk id, same with the id in request.
>>>> + * success: indicate failure or success for the cmd.
>>>> + * rate: clock rate. Used for get rate.
>>>> + *
>>>> + * cmd and id are filled by backend and passed to frontend to
>>>> + * let frontend check whether the response is for the current
>>>> + * request or not.
>>>> + */
>>>> +struct xen_clkif_response {
>>>> +	uint32_t cmd;
>>>> +	uint32_t id;
>>>> +	uint32_t success;
>>>> +	uint64_t rate;
>>>> +};
>>>
>>>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>>>needed.
>> 
>> As Juergen suggested, use a request id. I'll fix this in V3.
>> 32-/64-bit unclean, I can not get you here (:
>
>The layout of above structure may end up different for 32- and
>64-bit guests, depending on the alignment of uint64_t.

Thanks for teaching me. In V3, the layout will be changed to like this
struct xen_clkif_response {
	uint32_t request_id;
	int32_t status;
	uint64_t rate;
};

And more macro definitions for the status entry:
#define XEN_CLK_OP_SUCCESS 0
#define XEN_CLK_ENABLE_FALIURE -1
#define XEN_CLK_DISABLE_FAILURE -2
#define XEN_CLK_PREPARE_FAILURE -3
#define XEN_CLK_UNPREPARE_FAILURE -4
#define XEN_CLK_SET_RATE_FAILURE -5
#define XEN_CLK_GET_RATE_FALIURE -6

>
>>>And what I'm missing throughout the file is some description of how
>>>clock events (interrupts?) are actually meant to make it into the
>>>guest.
>> 
>> I have a simple implementation v1 patch for linux, 
>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
>> You can see how it works.
>
>No, sorry, that's not the point of the inquiry. It seems to me that
>there are more aspects to this interface, not directly related to
>the request/reply protocol written down here, which need to be
>spelled out event if they don't require any particular definitions
>or type declarations.

I try to follow you about clock events(interrupts?), not sure whether I understand correct or not.
clock in this file is the clk for a device. In linux side, it managed by clk subsystem, drivers/clk/xx.
This is not the clock events or clock source in drivers/clocksource/xx.
For the pvclk interface, there will be no interrupt for the guest.

I'll add more texts in the file or commit log.

Thanks,
Peng.

>
>Jan
>

Peng Fan Jan. 20, 2016, 2:13 p.m. UTC | #14

Hi Juergen,

On Wed, Jan 20, 2016 at 01:11:39PM +0100, Juergen Gross wrote:
>On 20/01/16 12:48, Peng Fan wrote:
>> Hi Juergen,
>> 
>> On Wed, Jan 20, 2016 at 11:40:55AM +0100, Juergen Gross wrote:
>>> On 20/01/16 10:25, Peng Fan wrote:
>>>> Hi Juergen,
>>>>
>>>> On Wed, Jan 20, 2016 at 10:05:15AM +0100, Juergen Gross wrote:
>>>>> On 20/01/16 09:31, Peng Fan wrote:
>>>>>> Introduce pvclk interface which is useful when doing device passthrough
>>>>>> on ARM platform.
>>>>>
>>>>> ...
>>>>>
>>>>>> +/*
>>>>>> + * Frontend request
>>>>>> + *
>>>>>> + * cmd: command for operation on clk, should be XEN_CLK_[xx],
>>>>>> + *	excluding XEN_CLK_END. id is the
>>>>>> + * id: clk id
>>>>>> + * rate: clock rate. Used for set rate.
>>>>>
>>>>> Which unit? Hz?
>>>>
>>>> Yeah. Hz. I'll add comments in V3.
>>>>
>>>>>
>>>>>> + */
>>>>>> +struct xen_clkif_request {
>>>>>> +	uint32_t cmd;
>>>>>> +	uint32_t id;
>>>>>> +	uint64_t rate;
>>>>>> +};
>>>>>> +typedef struct xen_clkif_request xen_clkif_request_t;
>>>>>> +
>>>>>> +/*
>>>>>> + * Backend response
>>>>>> + *
>>>>>> + * cmd: command for operation on clk, same with the cmd in request.
>>>>>> + * id: clk id, same with the id in request.
>>>>>> + * success: indicate failure or success for the cmd.
>>>>>
>>>>> Values?
>>>>
>>>> I'd like to let the frontend/backend driver to determine the value.
>>>> In my implementation for linux, if the backend API supports return value,
>>>> I'll fill the return value to success entry. If the backend API returns
>>>> void, I'll just fill 0 to success entry.
>>>
>>> So please specify "0" to mean success and add some sensible error
>>> values. There might be multiple frontend- and/or backend-variants which
>>> all must agree using the same interface.
>> 
>> Will change this to `int status`, as also observed by Jan.
>> I'll also define macros such as "#define XEN_CLK_ENABLE_OK 0", "#define XEN_CLK_ENABLE_FAILURE -1"
>> 
>>>
>>>>>> + * rate: clock rate. Used for get rate.
>>>>>> + *
>>>>>> + * cmd and id are filled by backend and passed to frontend to
>>>>>> + * let frontend check whether the response is for the current
>>>>>> + * request or not.
>>>>>
>>>>> I'd rather let the frontend add a request Id to the request which
>>>>> will be echoed here instead cmd and clock Id.
>>>>
>>>> If using request id, I think we can encode cmd and id into request id?
>>>
>>> This should just be an opaque value for the backend. The frontend is
>>> free how to create this value, it should be unique for every outstanding
>>> request of a domU, however. It could be built from cmd and id in case
>>> there can't be multiple requests with the same cmd/id combination
>>> active in a domU.
>> 
>> Thanks for teaching me on this. So the request id should be globally unique
>> for backend. So "random value(generated in frontend) + frontend domid" is ok for this?
>
>Being unique per frontend is enough, as each frontend will only ever see
>responses for it's own requests. Make it as simple as possible. I guess
>there will be a maximum of active requests possible, e.g. the number of
>request slots in the request ring. You could use the index into the ring
>as Id (pvSCSI frontend is doing it this way).

Thanks for this info.
In my case, such as let DomU handle uart2, I only need to let DomU
handle uart2-root-clk. Later I will passthrough gpu to DomU, only
gpu-root-clk needs be handled by DomU.
In linux side, clk operation is exclusive for one device(not definitely sure),
so the number of slots can be max number of devices that supported for device passthrough.

I'll take pvSCSI for reference.

Thanks,
Peng.

>
>Juergen

Jan Beulich Jan. 20, 2016, 2:16 p.m. UTC | #15

>>> On 20.01.16 at 15:05, <van.freenix@gmail.com> wrote:
> Hi Jan,
> On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
>>>>> On 20.01.16 at 12:40, <van.freenix@gmail.com> wrote:
>>> Hi Jan,
>>> 
>>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>>>>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
>>>>> +/*
>>>>> + * Backend response
>>>>> + *
>>>>> + * cmd: command for operation on clk, same with the cmd in request.
>>>>> + * id: clk id, same with the id in request.
>>>>> + * success: indicate failure or success for the cmd.
>>>>> + * rate: clock rate. Used for get rate.
>>>>> + *
>>>>> + * cmd and id are filled by backend and passed to frontend to
>>>>> + * let frontend check whether the response is for the current
>>>>> + * request or not.
>>>>> + */
>>>>> +struct xen_clkif_response {
>>>>> +	uint32_t cmd;
>>>>> +	uint32_t id;
>>>>> +	uint32_t success;
>>>>> +	uint64_t rate;
>>>>> +};
>>>>
>>>>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>>>>needed.
>>> 
>>> As Juergen suggested, use a request id. I'll fix this in V3.
>>> 32-/64-bit unclean, I can not get you here (:
>>
>>The layout of above structure may end up different for 32- and
>>64-bit guests, depending on the alignment of uint64_t.
> 
> Thanks for teaching me. In V3, the layout will be changed to like this
> struct xen_clkif_response {
> 	uint32_t request_id;
> 	int32_t status;
> 	uint64_t rate;
> };

Okay. Just as a not - iirc other pv interfaces use 64-bit ID values,
so not doing this here perhaps ought to be justified.

> And more macro definitions for the status entry:
> #define XEN_CLK_OP_SUCCESS 0
> #define XEN_CLK_ENABLE_FALIURE -1
> #define XEN_CLK_DISABLE_FAILURE -2
> #define XEN_CLK_PREPARE_FAILURE -3
> #define XEN_CLK_UNPREPARE_FAILURE -4
> #define XEN_CLK_SET_RATE_FAILURE -5
> #define XEN_CLK_GET_RATE_FALIURE -6

That's bogus - different kinds of errors would be meaningful,
but an error per command is quite pointless imo. (Also please
be aware of typos and parenthesization.)

>>>>And what I'm missing throughout the file is some description of how
>>>>clock events (interrupts?) are actually meant to make it into the
>>>>guest.
>>> 
>>> I have a simple implementation v1 patch for linux, 
>>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
>>> You can see how it works.
>>
>>No, sorry, that's not the point of the inquiry. It seems to me that
>>there are more aspects to this interface, not directly related to
>>the request/reply protocol written down here, which need to be
>>spelled out event if they don't require any particular definitions
>>or type declarations.
> 
> I try to follow you about clock events(interrupts?), not sure whether I 
> understand correct or not.
> clock in this file is the clk for a device. In linux side, it managed by clk 
> subsystem, drivers/clk/xx.
> This is not the clock events or clock source in drivers/clocksource/xx.
> For the pvclk interface, there will be no interrupt for the guest.

Then (also considering the set of commands you propose) what
use is the clock to the guest? It can't get events from it, it can't
read its current value, all it can is get/set its rate, enable/disable,
and prepare/unprepare it. I may be lacking some ARM knowledge
here, but all of this looks pretty odd to me.

Jan

Peng Fan Jan. 20, 2016, 2:37 p.m. UTC | #16

Hi Jan,

On Wed, Jan 20, 2016 at 07:16:36AM -0700, Jan Beulich wrote:
>>>> On 20.01.16 at 15:05, <van.freenix@gmail.com> wrote:
>> Hi Jan,
>> On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
>>>>>> On 20.01.16 at 12:40, <van.freenix@gmail.com> wrote:
>>>> Hi Jan,
>>>> 
>>>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>>>>>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
>>>>>> +/*
>>>>>> + * Backend response
>>>>>> + *
>>>>>> + * cmd: command for operation on clk, same with the cmd in request.
>>>>>> + * id: clk id, same with the id in request.
>>>>>> + * success: indicate failure or success for the cmd.
>>>>>> + * rate: clock rate. Used for get rate.
>>>>>> + *
>>>>>> + * cmd and id are filled by backend and passed to frontend to
>>>>>> + * let frontend check whether the response is for the current
>>>>>> + * request or not.
>>>>>> + */
>>>>>> +struct xen_clkif_response {
>>>>>> +	uint32_t cmd;
>>>>>> +	uint32_t id;
>>>>>> +	uint32_t success;
>>>>>> +	uint64_t rate;
>>>>>> +};
>>>>>
>>>>>This isn't 32-/64-bit clean. Question is whether echoing cmd is really
>>>>>needed.
>>>> 
>>>> As Juergen suggested, use a request id. I'll fix this in V3.
>>>> 32-/64-bit unclean, I can not get you here (:
>>>
>>>The layout of above structure may end up different for 32- and
>>>64-bit guests, depending on the alignment of uint64_t.
>> 
>> Thanks for teaching me. In V3, the layout will be changed to like this
>> struct xen_clkif_response {
>> 	uint32_t request_id;
>> 	int32_t status;
>> 	uint64_t rate;
>> };
>
>Okay. Just as a not - iirc other pv interfaces use 64-bit ID values,
>so not doing this here perhaps ought to be justified.

oh. I see uint64_t id in blkif.h :). Thanks.

>
>> And more macro definitions for the status entry:
>> #define XEN_CLK_OP_SUCCESS 0
>> #define XEN_CLK_ENABLE_FALIURE -1
>> #define XEN_CLK_DISABLE_FAILURE -2
>> #define XEN_CLK_PREPARE_FAILURE -3
>> #define XEN_CLK_UNPREPARE_FAILURE -4
>> #define XEN_CLK_SET_RATE_FAILURE -5
>> #define XEN_CLK_GET_RATE_FALIURE -6
>
>That's bogus - different kinds of errors would be meaningful,
>but an error per command is quite pointless imo. (Also please
>be aware of typos and parenthesization.)

Will fine tune this in V3.

>
>>>>>And what I'm missing throughout the file is some description of how
>>>>>clock events (interrupts?) are actually meant to make it into the
>>>>>guest.
>>>> 
>>>> I have a simple implementation v1 patch for linux, 
>>>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
>>>> You can see how it works.
>>>
>>>No, sorry, that's not the point of the inquiry. It seems to me that
>>>there are more aspects to this interface, not directly related to
>>>the request/reply protocol written down here, which need to be
>>>spelled out event if they don't require any particular definitions
>>>or type declarations.
>> 
>> I try to follow you about clock events(interrupts?), not sure whether I 
>> understand correct or not.
>> clock in this file is the clk for a device. In linux side, it managed by clk 
>> subsystem, drivers/clk/xx.
>> This is not the clock events or clock source in drivers/clocksource/xx.
>> For the pvclk interface, there will be no interrupt for the guest.
>
>Then (also considering the set of commands you propose) what
>use is the clock to the guest? It can't get events from it, it can't
>read its current value, all it can is get/set its rate, enable/disable,
>and prepare/unprepare it. I may be lacking some ARM knowledge
>here, but all of this looks pretty odd to me.

I follow this https://events.linuxfoundation.org/sites/events/files/slides/talk_5.pdf to do platform device
passthrough on ARM platform. I met the same issue as note in the ppt, at page 24.

In my test case, the uart driver works well without xen. There is serveral function calls in the driver, such as
clk = clk_get("uart2_root"),rate = clk_get_rate(clk), use rate to set baudrate for uart.


There is a clk tree in kernel without XEN or dom0 kernel like the following (only an example):
osc -
    |-->pll1
    |-->pll2
         |-->pll2_div
	         |-->uart2_gate
		         |-->uart2_root  --> clk for uart2

uart2_root is directly handled by Dom0.If I assign uart2 to DomU, DomU does
not have the clk tree as above, so DomU can not handle directly handle uart2_root and the uart2 driver in
DomU will run into failure to intialize the uart2 hardware IP.

So I introudce pvclk. Pass the operation for uart2_root in DomU to Dom0 and Dom0 directly handle the clock management hardware IP.

Hope this is clear.

Thanks,
Peng.

>
>Jan
>

Ian Campbell Jan. 20, 2016, 2:49 p.m. UTC | #17

On Wed, 2016-01-20 at 22:37 +0800, Peng Fan wrote:
> 
> > Then (also considering the set of commands you propose) what
> > use is the clock to the guest? It can't get events from it, it can't
> > read its current value, all it can is get/set its rate, enable/disable,
> > and prepare/unprepare it. I may be lacking some ARM knowledge
> > here, but all of this looks pretty odd to me.

Perhaps it helps to mention that these are the clocks which drive the
silicon in the device IP (in the "I have a 2GHz processor" sense), rather
than anything to do with time or (s/w visible) events.

On x86 these are not often under OS control, but on ARM it is common for
the network of clocks and associated muxers and dividers to be made
available to the OS in quite a fine grained manner and for
unused/unnecessary clocks to be disabled or under utilised ones to be
scaled to save power etc.

Ian.

Jan Beulich Jan. 20, 2016, 2:52 p.m. UTC | #18

>>> On 20.01.16 at 15:37, <van.freenix@gmail.com> wrote:
> On Wed, Jan 20, 2016 at 07:16:36AM -0700, Jan Beulich wrote:
>>>>> On 20.01.16 at 15:05, <van.freenix@gmail.com> wrote:
>>> On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
>>>>>>> On 20.01.16 at 12:40, <van.freenix@gmail.com> wrote:
>>>>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>>>>>>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
>>>>>>And what I'm missing throughout the file is some description of how
>>>>>>clock events (interrupts?) are actually meant to make it into the
>>>>>>guest.
>>>>> 
>>>>> I have a simple implementation v1 patch for linux, 
>>>>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
>>>>> You can see how it works.
>>>>
>>>>No, sorry, that's not the point of the inquiry. It seems to me that
>>>>there are more aspects to this interface, not directly related to
>>>>the request/reply protocol written down here, which need to be
>>>>spelled out event if they don't require any particular definitions
>>>>or type declarations.
>>> 
>>> I try to follow you about clock events(interrupts?), not sure whether I 
>>> understand correct or not.
>>> clock in this file is the clk for a device. In linux side, it managed by clk 
>>> subsystem, drivers/clk/xx.
>>> This is not the clock events or clock source in drivers/clocksource/xx.
>>> For the pvclk interface, there will be no interrupt for the guest.
>>
>>Then (also considering the set of commands you propose) what
>>use is the clock to the guest? It can't get events from it, it can't
>>read its current value, all it can is get/set its rate, enable/disable,
>>and prepare/unprepare it. I may be lacking some ARM knowledge
>>here, but all of this looks pretty odd to me.
> 
> I follow this 
> https://events.linuxfoundation.org/sites/events/files/slides/talk_5.pdf to do 
> platform device
> passthrough on ARM platform. I met the same issue as note in the ppt, at 
> page 24.
> 
> In my test case, the uart driver works well without xen. There is serveral 
> function calls in the driver, such as
> clk = clk_get("uart2_root"),rate = clk_get_rate(clk), use rate to set 
> baudrate for uart.
> 
> 
> There is a clk tree in kernel without XEN or dom0 kernel like the following 
> (only an example):
> osc -
>     |-->pll1
>     |-->pll2
>          |-->pll2_div
> 	         |-->uart2_gate
> 		         |-->uart2_root  --> clk for uart2
> 
> uart2_root is directly handled by Dom0.If I assign uart2 to DomU, DomU does
> not have the clk tree as above, so DomU can not handle directly handle 
> uart2_root and the uart2 driver in
> DomU will run into failure to intialize the uart2 hardware IP.
> 
> So I introudce pvclk. Pass the operation for uart2_root in DomU to Dom0 and 
> Dom0 directly handle the clock management hardware IP.
> 
> Hope this is clear.

I'm afraid it's not, but it now looks even more like this is too ARM
specific for me to comment. I wonder whether a generic (not
ARM specific) PV I/O protocol is actually warranted here. In my
(presumably too simplistic) view controlling the clock of a passed
through platform device should be part of that pass-through,
not the subject of a PV side band protocol. From an abstract
pov the passed through device should work without any PV I/O
protocol; such protocols are generally only to accelerate I/O,
not to make things work in the first place.

Jan

Peng Fan Jan. 21, 2016, 1:29 a.m. UTC | #19

Hi Jan,

On Wed, Jan 20, 2016 at 07:52:58AM -0700, Jan Beulich wrote:
>>>> On 20.01.16 at 15:37, <van.freenix@gmail.com> wrote:
>> On Wed, Jan 20, 2016 at 07:16:36AM -0700, Jan Beulich wrote:
>>>>>> On 20.01.16 at 15:05, <van.freenix@gmail.com> wrote:
>>>> On Wed, Jan 20, 2016 at 05:01:40AM -0700, Jan Beulich wrote:
>>>>>>>> On 20.01.16 at 12:40, <van.freenix@gmail.com> wrote:
>>>>>> On Wed, Jan 20, 2016 at 03:14:20AM -0700, Jan Beulich wrote:
>>>>>>>>>> On 20.01.16 at 09:31, <van.freenix@gmail.com> wrote:
>>>>>>>And what I'm missing throughout the file is some description of how
>>>>>>>clock events (interrupts?) are actually meant to make it into the
>>>>>>>guest.
>>>>>> 
>>>>>> I have a simple implementation v1 patch for linux, 
>>>>>> http://lists.xen.org/archives/html/xen-devel/2016-01/msg01860.html.
>>>>>> You can see how it works.
>>>>>
>>>>>No, sorry, that's not the point of the inquiry. It seems to me that
>>>>>there are more aspects to this interface, not directly related to
>>>>>the request/reply protocol written down here, which need to be
>>>>>spelled out event if they don't require any particular definitions
>>>>>or type declarations.
>>>> 
>>>> I try to follow you about clock events(interrupts?), not sure whether I 
>>>> understand correct or not.
>>>> clock in this file is the clk for a device. In linux side, it managed by clk 
>>>> subsystem, drivers/clk/xx.
>>>> This is not the clock events or clock source in drivers/clocksource/xx.
>>>> For the pvclk interface, there will be no interrupt for the guest.
>>>
>>>Then (also considering the set of commands you propose) what
>>>use is the clock to the guest? It can't get events from it, it can't
>>>read its current value, all it can is get/set its rate, enable/disable,
>>>and prepare/unprepare it. I may be lacking some ARM knowledge
>>>here, but all of this looks pretty odd to me.
>> 
>> I follow this 
>> https://events.linuxfoundation.org/sites/events/files/slides/talk_5.pdf to do 
>> platform device
>> passthrough on ARM platform. I met the same issue as note in the ppt, at 
>> page 24.
>> 
>> In my test case, the uart driver works well without xen. There is serveral 
>> function calls in the driver, such as
>> clk = clk_get("uart2_root"),rate = clk_get_rate(clk), use rate to set 
>> baudrate for uart.
>> 
>> 
>> There is a clk tree in kernel without XEN or dom0 kernel like the following 
>> (only an example):
>> osc -
>>     |-->pll1
>>     |-->pll2
>>          |-->pll2_div
>> 	         |-->uart2_gate
>> 		         |-->uart2_root  --> clk for uart2
>> 
>> uart2_root is directly handled by Dom0.If I assign uart2 to DomU, DomU does
>> not have the clk tree as above, so DomU can not handle directly handle 
>> uart2_root and the uart2 driver in
>> DomU will run into failure to intialize the uart2 hardware IP.
>> 
>> So I introudce pvclk. Pass the operation for uart2_root in DomU to Dom0 and 
>> Dom0 directly handle the clock management hardware IP.
>> 
>> Hope this is clear.
>
>I'm afraid it's not, but it now looks even more like this is too ARM
>specific for me to comment. I wonder whether a generic (not
>ARM specific) PV I/O protocol is actually warranted here. In my
>(presumably too simplistic) view controlling the clock of a passed
>through platform device should be part of that pass-through,
>not the subject of a PV side band protocol.From an abstract
>pov the passed through device should work without any PV I/O
>protocol; such protocols are generally only to accelerate I/O,
>not to make things work in the first place.

The platform device passthrough part for arm is to mapping the machine io address
to the guest physical io address. Then guest can map the phsical io address to its
own virtual address, then by accessing virtual address, guest can access machine io address space.
So the platform device passthrough does not needs frontend/backend driver to support, except smmu is handled by xen.

But the platform device needs clk to drive the hardware IP, also may needs pinmux settings.
the driver in guest needs to drive the hardware IP passed through to the guest, so it needs to operate on the clk.

Just pasted comments from George for V1:

"
Just speaking from the perspective of a Xen dev who's not an ARM dev:
a few more words on the relationship between pvclk and
device-passthrough would be helpful to set the context.  It sounds
like:

* On ARM, passing through a device requires a clocksource (at least
for many devices)

* dom0 has the hardware clocksource, but at the moment domUs don't
have a suitable clocksource

* This patch implements pvclk front/backend suitable for such devices

Is that right?  In which case something like the following would be helpful:

"This patch introduces pvclk, a paravirtualized clock source suitable
for devices to use when passing through to domUs on ARM systems."
"

Since my use case is for ARM embedded products, X86 may not need this. I try to make this interface common,
but not sure.. If you have better ideas, please advise me.

Thanks,
Peng.

>
>Jan
>

Jan Beulich Jan. 21, 2016, 7:53 a.m. UTC | #20

>>> On 21.01.16 at 02:29, <van.freenix@gmail.com> wrote:
> The platform device passthrough part for arm is to mapping the machine io 
> address
> to the guest physical io address. Then guest can map the phsical io address 
> to its
> own virtual address, then by accessing virtual address, guest can access 
> machine io address space.
> So the platform device passthrough does not needs frontend/backend driver to 
> support, except smmu is handled by xen.
> 
> But the platform device needs clk to drive the hardware IP, also may needs 
> pinmux settings.
> the driver in guest needs to drive the hardware IP passed through to the 
> guest, so it needs to operate on the clk.
> 
> Just pasted comments from George for V1:
> 
> "
> Just speaking from the perspective of a Xen dev who's not an ARM dev:
> a few more words on the relationship between pvclk and
> device-passthrough would be helpful to set the context.  It sounds
> like:
> 
> * On ARM, passing through a device requires a clocksource (at least
> for many devices)
> 
> * dom0 has the hardware clocksource, but at the moment domUs don't
> have a suitable clocksource
> 
> * This patch implements pvclk front/backend suitable for such devices
> 
> Is that right?  In which case something like the following would be helpful:
> 
> "This patch introduces pvclk, a paravirtualized clock source suitable
> for devices to use when passing through to domUs on ARM systems."
> "

That's a possible perspective to take, but not the only one. In
fact, coming to what I said previously, I wonder whether placing
the "backend" in Dom0 is the right thing in the first place -
fundamentally arbitration of hardware use should be done
(or at least checked/enforced) by the hypervisor. I.e. just like
while Dom0 may assign a PCI device to a guest, the hypervisor
is in charge of actually making all the resources needed for this
to work accessible to the guest, and/or verifying that permissions
are in place (like e.g. when setting up interrupts). Yet the model
proposed here completely bypasses the hypervisor afaict.

Are there connections between a platform device and its clock(s),
e.g. in DT? If so, wouldn't it be possible for granting access to a
platform device to imply granting control of the respective clock?
In which case clock control might perhaps better become a
hypercall based mechanism? And further - are all clocks in use for
at most one platform device (i.e. is there no sharing possible)? If
not, how do you envision to make multiple parties agree on the
clock settings and state?

> Since my use case is for ARM embedded products, X86 may not need this.

Which already points at one of the issues in your Linux patches:
The drivers did get enabled unconditionally iirc, which would
need changing especially when they're likely useless on x86.

> I try to make this interface common,
> but not sure.

I'm not sure either.

Jan

Peng Fan Jan. 21, 2016, 8:59 a.m. UTC | #21

Hi Jan,

On Thu, Jan 21, 2016 at 12:53:01AM -0700, Jan Beulich wrote:
>>>> On 21.01.16 at 02:29, <van.freenix@gmail.com> wrote:
>> The platform device passthrough part for arm is to mapping the machine io 
>> address
>> to the guest physical io address. Then guest can map the phsical io address 
>> to its
>> own virtual address, then by accessing virtual address, guest can access 
>> machine io address space.
>> So the platform device passthrough does not needs frontend/backend driver to 
>> support, except smmu is handled by xen.
>> 
>> But the platform device needs clk to drive the hardware IP, also may needs 
>> pinmux settings.
>> the driver in guest needs to drive the hardware IP passed through to the 
>> guest, so it needs to operate on the clk.
>> 
>> Just pasted comments from George for V1:
>> 
>> "
>> Just speaking from the perspective of a Xen dev who's not an ARM dev:
>> a few more words on the relationship between pvclk and
>> device-passthrough would be helpful to set the context.  It sounds
>> like:
>> 
>> * On ARM, passing through a device requires a clocksource (at least
>> for many devices)
>> 
>> * dom0 has the hardware clocksource, but at the moment domUs don't
>> have a suitable clocksource
>> 
>> * This patch implements pvclk front/backend suitable for such devices
>> 
>> Is that right?  In which case something like the following would be helpful:
>> 
>> "This patch introduces pvclk, a paravirtualized clock source suitable
>> for devices to use when passing through to domUs on ARM systems."
>> "
>
>That's a possible perspective to take, but not the only one. In
>fact, coming to what I said previously, I wonder whether placing
>the "backend" in Dom0 is the right thing in the first place -
>fundamentally arbitration of hardware use should be done
>(or at least checked/enforced) by the hypervisor. I.e. just like
>while Dom0 may assign a PCI device to a guest, the hypervisor
>is in charge of actually making all the resources needed for this
>to work accessible to the guest, and/or verifying that permissions
>are in place (like e.g. when setting up interrupts). Yet the model
>proposed here completely bypasses the hypervisor afaict.

To platform device of ARM, hypervisor is responsible for the mapping
between machine address and guest physical address, also responsible
for the irq mapping.

But to embedded ARM SoC, the hardware clk IP is handled in Dom0.
On my i.MX platform, the hardware clk IP named
Clock Controller Module will output clks to drive different IPs, such as
uart,gpu,lcd,sd controller,scsi controller,pcie controller.
The hardware clock IP is not same for all ARM SoC vendors. ARM has
spec, such as GIC and SMMU to ask SoC vendors follow the spec, then
it's easy to let hypervisor handle them. But there is no common spec
for the clock IP.

>
>Are there connections between a platform device and its clock(s),
>e.g. in DT? If so, wouldn't it be possible for granting access to a
>platform device to imply granting control of the respective clock?

clock hardware IP is also a device. The following is partial dts for i.MX7Dual.
			clks: ccm@30380000 {
				compatible = "fsl,imx7d-ccm";
				reg = <0x30380000 0x10000>;
				interrupts = <GIC_SPI 85 IRQ_TYPE_LEVEL_HIGH>,
					     <GIC_SPI 86 IRQ_TYPE_LEVEL_HIGH>;
				#clock-cells = <1>;
				clocks = <&ckil>, <&osc>;
				clock-names = "ckil", "osc";
			};

			uart2: serial@30890000 {
				compatible = "fsl,imx7d-uart",
					     "fsl,imx6q-uart";
				reg = <0x30890000 0x10000>;
				interrupts = <GIC_SPI 27 IRQ_TYPE_LEVEL_HIGH>;
				clocks = <&clks IMX7D_UART2_ROOT_CLK>,
					<&clks IMX7D_UART2_ROOT_CLK>;
				clock-names = "ipg", "per";
				status = "disabled";
			};
uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm.
passthrough uart2, hypervisor handles the reg and interrupts, that is because
hypervisor handles the memory map and the interrupt controller(GIC). But here
CCM is not handled by hypervisor, it is handled by Dom0.

>In which case clock control might perhaps better become a
>hypercall based mechanism? And further - are all clocks in use for
>at most one platform device (i.e. is there no sharing possible)? If
>not, how do you envision to make multiple parties agree on the
>clock settings and state?

For ARMV8 server products, I am not sure whether hypercall is better; but to my
case, it's not feasible to use hypercall.

Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to
enable/disable/set rate for a clock for the device. So I think it's okay
for multipile parties, the clk subsystem in Dom0 can handle mutiple requests even
if the clock is shared. Anyway it also depends on the design of frontend/backend driver.

>
>> Since my use case is for ARM embedded products, X86 may not need this.
>
>Which already points at one of the issues in your Linux patches:
>The drivers did get enabled unconditionally iirc, which would
>need changing especially when they're likely useless on x86.

(:-. This is an initial patch mainly for asking comments on the way I implemented
to handle clock for platform device passthrough, and I hope xen and linux experts can advise me if
the way I implemented is bad or they have better methods.
Maybe need "CONFIG_XEN_CLK" in the makefile.

Thanks,
Peng.
>
>> I try to make this interface common,
>> but not sure.
>
>I'm not sure either.
>
>Jan
>

Ian Campbell Jan. 21, 2016, 10:19 a.m. UTC | #22

On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
> 
> To platform device of ARM, hypervisor is responsible for the mapping
> between machine address and guest physical address, also responsible
> for the irq mapping.
> 
> But to embedded ARM SoC, the hardware clk IP is handled in Dom0.

Arguably Xen ought to be the one to do this, but we have determined
(rightly, I think) that doing so for the entirely clk tree of an SoC would
involve importing an unmanageable amount of code into Xen[0].

In the meantime we defer this to Dom0.

I wonder though if it would be possible to manage the clocks for a
passthrough device from the toolstack, i.e. is there a sysfs node where one
can say "keep the clock for this device enabled (at xMhz) even though you
think the device is unused"?

If so (or if it can be easily added) then the toolstack would just need to
manage that value as part of the passthrough process rather than having the
frontend do it via a PV protocol.

Ian.

[0] I'd like at some point for Xen to gain sufficient knowledge of clock IP
to minimally control things like the CPU and DRAM clocks etc, but that
needn't imply full clock tree support and would hopefully be a manageable
amount of code, which would (hopefully) mostly be about trapping and
emulating writes to one or two clock controllers per platform and
arbitrating a small set of bits (while allowing dom0 to control the
others). This is at least a medium if not long term idea though, and for
all I know it might turn out to be unworkable in practice.

Jan Beulich Jan. 21, 2016, 10:21 a.m. UTC | #23

>>> On 21.01.16 at 09:59, <van.freenix@gmail.com> wrote:
> uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm.
> passthrough uart2, hypervisor handles the reg and interrupts, that is 
> because
> hypervisor handles the memory map and the interrupt controller(GIC). But 
> here
> CCM is not handled by hypervisor, it is handled by Dom0.

This, I take it, describes the current state, which doesn't make clear
whether this is intentionally that way (and can't be changed), or
just happens to be that way because so far it didn't matter.

> For ARMV8 server products, I am not sure whether hypercall is better; but to 
> my case, it's not feasible to use hypercall.

Because of ...?

> Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to
> enable/disable/set rate for a clock for the device. So I think it's okay
> for multipile parties, the clk subsystem in Dom0 can handle mutiple requests 
> even if the clock is shared.

I.e. if one party sets one rate, and later another party sets
a different rate, things are going to work fine? If so, why would
the different parties need control over the rate in the first place?

Jan

Peng Fan Jan. 21, 2016, 11:55 a.m. UTC | #24

Hi Ian,

On Thu, Jan 21, 2016 at 10:19:32AM +0000, Ian Campbell wrote:
>On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
>> 
>> To platform device of ARM, hypervisor is responsible for the mapping
>> between machine address and guest physical address, also responsible
>> for the irq mapping.
>> 
>> But to embedded ARM SoC, the hardware clk IP is handled in Dom0.
>
>Arguably Xen ought to be the one to do this, but we have determined
>(rightly, I think) that doing so for the entirely clk tree of an SoC would
>involve importing an unmanageable amount of code into Xen[0].
>
>In the meantime we defer this to Dom0.
>
>I wonder though if it would be possible to manage the clocks for a
>passthrough device from the toolstack, i.e. is there a sysfs node where one
>can say "keep the clock for this device enabled (at xMhz) even though you
>think the device is unused"?

I am afraid not. The linux device driver without xen can work well, because
there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the driver.
I do not want to remove the clk apis usage from the device driver for xen, because the driver
also serves when no xen support. The pvclk interface is mainly to let the clk api can work as no xen.

Introduce pvclk may introduce more stuff, I have no more idea for now, 
if want to let the device driver unchanged and work well for passed through device. (:

>
>If so (or if it can be easily added) then the toolstack would just need to
>manage that value as part of the passthrough process rather than having the
>frontend do it via a PV protocol.
>
>Ian.
>
>[0] I'd like at some point for Xen to gain sufficient knowledge of clock IP
>to minimally control things like the CPU and DRAM clocks etc, but that
>needn't imply full clock tree support and would hopefully be a manageable
>amount of code, which would (hopefully) mostly be about trapping and
>emulating writes to one or two clock controllers per platform and
>arbitrating a small set of bits (while allowing dom0 to control the
>others). This is at least a medium if not long term idea though, and for
>all I know it might turn out to be unworkable in practice.

I am happy to see that if one day, xen takes part of the clock management to minimize power usage.
These few days, I also look into the power management of xen. seems it mainly serves for x86.
Let xen hypervisor handle cpu and dram clocks may be better than let dom0.

Thanks,
Peng.

>

Peng Fan Jan. 21, 2016, 12:06 p.m. UTC | #25

Hi Jan,

On Thu, Jan 21, 2016 at 03:21:38AM -0700, Jan Beulich wrote:
>>>> On 21.01.16 at 09:59, <van.freenix@gmail.com> wrote:
>> uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm.
>> passthrough uart2, hypervisor handles the reg and interrupts, that is 
>> because
>> hypervisor handles the memory map and the interrupt controller(GIC). But 
>> here
>> CCM is not handled by hypervisor, it is handled by Dom0.
>
>This, I take it, describes the current state, which doesn't make clear
>whether this is intentionally that way (and can't be changed), or
>just happens to be that way because so far it didn't matter.
>
>> For ARMV8 server products, I am not sure whether hypercall is better; but to 
>> my case, it's not feasible to use hypercall.
>
>Because of ...?

I guess you mean DomU issues hypercall and Xen forwards the request to Dom0,
then Dom0 reply the response?

If you experts think pvclk is not a good way to handle the clock for passed through
device, I can try hypercall way.

>
>> Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to
>> enable/disable/set rate for a clock for the device. So I think it's okay
>> for multipile parties, the clk subsystem in Dom0 can handle mutiple requests 
>> even if the clock is shared.
>
>I.e. if one party sets one rate, and later another party sets
>a different rate, things are going to work fine? If so, why would
>the different parties need control over the rate in the first place?

oh. thanks for teaching me. If two IPs share one clock, and IP1 for Dom0, IP2 for DomU,
Dom0 needs clock work at 20Hz, but DomU want clock work at 40Hz. pvclk can not avoid this.

If not using pvclk, we develop a new method to handle clock. I guest we can also not avoid this?

Thanks,
Peng.

>
>Jan
>

Ian Campbell Jan. 21, 2016, 12:26 p.m. UTC | #26

On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote:
> Hi Ian,
> 
> On Thu, Jan 21, 2016 at 10:19:32AM +0000, Ian Campbell wrote:
> > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
> > >  
> > > To platform device of ARM, hypervisor is responsible for the mapping
> > > between machine address and guest physical address, also responsible
> > > for the irq mapping.
> > > 
> > > But to embedded ARM SoC, the hardware clk IP is handled in Dom0.
> > 
> > Arguably Xen ought to be the one to do this, but we have determined
> > (rightly, I think) that doing so for the entirely clk tree of an SoC
> > would
> > involve importing an unmanageable amount of code into Xen[0].
> > 
> > In the meantime we defer this to Dom0.
> > 
> > I wonder though if it would be possible to manage the clocks for a
> > passthrough device from the toolstack, i.e. is there a sysfs node where
> > one
> > can say "keep the clock for this device enabled (at xMhz) even though
> > you
> > think the device is unused"?
> 
> I am afraid not. The linux device driver without xen can work well,
> because
> there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the
> driver.
> I do not want to remove the clk apis usage from the device driver for
> xen, because the driver
> also serves when no xen support. The pvclk interface is mainly to let the
> clk api can work as no xen.

Would adding a dummy fixed-clock[0] (or several of them) to the guest
passthrough DT satisfy the driver's requirements? They would be hardcoded
to whatever rate dom0 and/or the tools has decided upon (and had set in the
real h/w).

Ian.

[0] Documentation/devicetree/bindings/clock/fixed-clock.txt

Peng Fan Jan. 21, 2016, 12:35 p.m. UTC | #27

Hi Ian,

On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
>On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote:
>> Hi Ian,
>> 
>> On Thu, Jan 21, 2016 at 10:19:32AM +0000, Ian Campbell wrote:
>> > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
>> > >  
>> > > To platform device of ARM, hypervisor is responsible for the mapping
>> > > between machine address and guest physical address, also responsible
>> > > for the irq mapping.
>> > > 
>> > > But to embedded ARM SoC, the hardware clk IP is handled in Dom0.
>> > 
>> > Arguably Xen ought to be the one to do this, but we have determined
>> > (rightly, I think) that doing so for the entirely clk tree of an SoC
>> > would
>> > involve importing an unmanageable amount of code into Xen[0].
>> > 
>> > In the meantime we defer this to Dom0.
>> > 
>> > I wonder though if it would be possible to manage the clocks for a
>> > passthrough device from the toolstack, i.e. is there a sysfs node where
>> > one
>> > can say "keep the clock for this device enabled (at xMhz) even though
>> > you
>> > think the device is unused"?
>> 
>> I am afraid not. The linux device driver without xen can work well,
>> because
>> there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the
>> driver.
>> I do not want to remove the clk apis usage from the device driver for
>> xen, because the driver
>> also serves when no xen support. The pvclk interface is mainly to let the
>> clk api can work as no xen.
>
>Would adding a dummy fixed-clock[0] (or several of them) to the guest
>passthrough DT satisfy the driver's requirements? They would be hardcoded
>to whatever rate dom0 and/or the tools has decided upon (and had set in the
>real h/w).

If using this way, we have the assumption that DomU device driver would not
change the rate of the clock driving the device. I am not sure whether this is
ok for so many platform devices based ARM core.

In /sys/kernel/debug/clk/...., there are clk tree info, but
clk api are not exposed to userspace as far as I know, so
if using sysfs interface to set a known fixed rate or enable/disable the clock,
we need to expose the clk info to userspace.

Jan said using hypercall in the other mail, do you have any ideas about this?

Thanks,
Peng.

>
>Ian.
>
>[0] Documentation/devicetree/bindings/clock/fixed-clock.txt
>

Ian Campbell Jan. 21, 2016, 12:49 p.m. UTC | #28

On Thu, 2016-01-21 at 20:35 +0800, Peng Fan wrote:
> On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
> > Would adding a dummy fixed-clock[0] (or several of them) to the guest
> > passthrough DT satisfy the driver's requirements? They would be hardcoded
> > to whatever rate dom0 and/or the tools has decided upon (and had set in the
> > real h/w).
> 
> If using this way, we have the assumption that DomU device driver would not
> change the rate of the clock driving the device. I am not sure whether this is
> ok for so many platform devices based ARM core.

Don't (non-buggy) drivers already need to cope with the possibility that
the clk core may not be able to satisfy their request for a particular rate
due to constraints from other ip blocks?

I would also expect the user to want to configure things in dom0 such that
the specific drivers used in domU are satisfied with what they get (which
is a bit fiddly perhaps, but platform device passthrough already is
somewhat that way).

> In /sys/kernel/debug/clk/...., there are clk tree info, but
> clk api are not exposed to userspace as far as I know, so
> if using sysfs interface to set a known fixed rate or enable/disable the clock,
> we need to expose the clk info to userspace.

That seems possible to arrange given a use case for it though, a SMOP (and
convincing the clk maintainer of the need).

Worst case is a xen-clkback driver or perhaps even vfio will want to do
something like this, we can't normally use vfio on Xen, but in this case
perhaps we could.

> Jan said using hypercall in the other mail, do you have any ideas about
> this?

This would need a whole clock infrastructure in Xen, wouldn't it? I
described why that is not currently available in an earlier mail, and also
my opinion that doing the whole thing in Xen would involve pulling in far
too much SoC specific code for each specific platform.

Ian.
>

Jan Beulich Jan. 21, 2016, 12:52 p.m. UTC | #29

>>> On 21.01.16 at 13:06, <van.freenix@gmail.com> wrote:
> On Thu, Jan 21, 2016 at 03:21:38AM -0700, Jan Beulich wrote:
>>>>> On 21.01.16 at 09:59, <van.freenix@gmail.com> wrote:
>>> uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm.
>>> passthrough uart2, hypervisor handles the reg and interrupts, that is 
>>> because
>>> hypervisor handles the memory map and the interrupt controller(GIC). But 
>>> here
>>> CCM is not handled by hypervisor, it is handled by Dom0.
>>
>>This, I take it, describes the current state, which doesn't make clear
>>whether this is intentionally that way (and can't be changed), or
>>just happens to be that way because so far it didn't matter.
>>
>>> For ARMV8 server products, I am not sure whether hypercall is better; but to
>>> my case, it's not feasible to use hypercall.
>>
>>Because of ...?
> 
> I guess you mean DomU issues hypercall and Xen forwards the request to Dom0,
> then Dom0 reply the response?

Well, no, that wouldn't be a desirable (or even sane) model.

>>> Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to
>>> enable/disable/set rate for a clock for the device. So I think it's okay
>>> for multipile parties, the clk subsystem in Dom0 can handle mutiple requests 
>>> even if the clock is shared.
>>
>>I.e. if one party sets one rate, and later another party sets
>>a different rate, things are going to work fine? If so, why would
>>the different parties need control over the rate in the first place?
> 
> oh. thanks for teaching me. If two IPs share one clock, and IP1 for Dom0, 
> IP2 for DomU,
> Dom0 needs clock work at 20Hz, but DomU want clock work at 40Hz. pvclk can 
> not avoid this.

But you mustn't allow a DomU to affect Dom0, nor may two DomU-s
interact in such a way with one another.

> If not using pvclk, we develop a new method to handle clock. I guest we can 
> also not avoid this?

At the very least it would need to be avoided by denying the request.
Upon shared use, either all parties agree, or only one may use the
clock. And passing through a (platform) device would therefore imply
validating that the needed clock(s) are available to the target domain.
Doing this in a consistent way with all control in one component's
hands seems doable only if hypervisor and/or tool stack are the
controlling (and arbitrating) entity. In the end this is one of the
reasons why to me a simple PV I/O interface doesn't seem suitable
here.

Jan

Stefano Stabellini Jan. 21, 2016, 12:55 p.m. UTC | #30

On Thu, 21 Jan 2016, Peng Fan wrote:
> Hi Ian,
> 
> On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
> >On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote:
> >> Hi Ian,
> >> 
> >> On Thu, Jan 21, 2016 at 10:19:32AM +0000, Ian Campbell wrote:
> >> > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
> >> > >  
> >> > > To platform device of ARM, hypervisor is responsible for the mapping
> >> > > between machine address and guest physical address, also responsible
> >> > > for the irq mapping.
> >> > > 
> >> > > But to embedded ARM SoC, the hardware clk IP is handled in Dom0.
> >> > 
> >> > Arguably Xen ought to be the one to do this, but we have determined
> >> > (rightly, I think) that doing so for the entirely clk tree of an SoC
> >> > would
> >> > involve importing an unmanageable amount of code into Xen[0].
> >> > 
> >> > In the meantime we defer this to Dom0.
> >> > 
> >> > I wonder though if it would be possible to manage the clocks for a
> >> > passthrough device from the toolstack, i.e. is there a sysfs node where
> >> > one
> >> > can say "keep the clock for this device enabled (at xMhz) even though
> >> > you
> >> > think the device is unused"?
> >> 
> >> I am afraid not. The linux device driver without xen can work well,
> >> because
> >> there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well in the
> >> driver.
> >> I do not want to remove the clk apis usage from the device driver for
> >> xen, because the driver
> >> also serves when no xen support. The pvclk interface is mainly to let the
> >> clk api can work as no xen.
> >
> >Would adding a dummy fixed-clock[0] (or several of them) to the guest
> >passthrough DT satisfy the driver's requirements? They would be hardcoded
> >to whatever rate dom0 and/or the tools has decided upon (and had set in the
> >real h/w).
> 
> If using this way, we have the assumption that DomU device driver would not
> change the rate of the clock driving the device. I am not sure whether this is
> ok for so many platform devices based ARM core.
> 
> In /sys/kernel/debug/clk/...., there are clk tree info, but
> clk api are not exposed to userspace as far as I know, so
> if using sysfs interface to set a known fixed rate or enable/disable the clock,
> we need to expose the clk info to userspace.

Keeping in mind that we might now want to let DomU change the actual
clock frequency anyway, exposing a dummy clock to keep the DomU driver
happy might be a good option.

However whether we set the frequency from the Dom0 kernel or from the
toolstack, how do we find out the right clock rate? From a grep in the
kernel sources, it looks like some drivers still have hardcoded values.

Asking the user to provide the clock rate seems a bit too much to me.


> Jan said using hypercall in the other mail, do you have any ideas about this?

I recall having discussed this in the past and the conclusion was that
moving the clock framework into Xen, so that the hypervisor could
directly control clock frequencies, although it might make sense from an
architectural point of view, would be too complex to be feasible.

Ian Campbell Jan. 21, 2016, 1:11 p.m. UTC | #31

On Thu, 2016-01-21 at 12:55 +0000, Stefano Stabellini wrote:
> On Thu, 21 Jan 2016, Peng Fan wrote:
> > Hi Ian,
> > 
> > On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
> > > On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote:
> > > > Hi Ian,
> > > > 
> > > > On Thu, Jan 21, 2016 at 10:19:32AM +0000, Ian Campbell wrote:
> > > > > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
> > > > > >  
> > > > > > To platform device of ARM, hypervisor is responsible for the
> > > > > > mapping
> > > > > > between machine address and guest physical address, also
> > > > > > responsible
> > > > > > for the irq mapping.
> > > > > > 
> > > > > > But to embedded ARM SoC, the hardware clk IP is handled in
> > > > > > Dom0.
> > > > > 
> > > > > Arguably Xen ought to be the one to do this, but we have
> > > > > determined
> > > > > (rightly, I think) that doing so for the entirely clk tree of an
> > > > > SoC
> > > > > would
> > > > > involve importing an unmanageable amount of code into Xen[0].
> > > > > 
> > > > > In the meantime we defer this to Dom0.
> > > > > 
> > > > > I wonder though if it would be possible to manage the clocks for
> > > > > a
> > > > > passthrough device from the toolstack, i.e. is there a sysfs node
> > > > > where
> > > > > one
> > > > > can say "keep the clock for this device enabled (at xMhz) even
> > > > > though
> > > > > you
> > > > > think the device is unused"?
> > > > 
> > > > I am afraid not. The linux device driver without xen can work well,
> > > > because
> > > > there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well
> > > > in the
> > > > driver.
> > > > I do not want to remove the clk apis usage from the device driver
> > > > for
> > > > xen, because the driver
> > > > also serves when no xen support. The pvclk interface is mainly to
> > > > let the
> > > > clk api can work as no xen.
> > > 
> > > Would adding a dummy fixed-clock[0] (or several of them) to the guest
> > > passthrough DT satisfy the driver's requirements? They would be
> > > hardcoded
> > > to whatever rate dom0 and/or the tools has decided upon (and had set
> > > in the
> > > real h/w).
> > 
> > If using this way, we have the assumption that DomU device driver would
> > not
> > change the rate of the clock driving the device. I am not sure whether
> > this is
> > ok for so many platform devices based ARM core.
> > 
> > In /sys/kernel/debug/clk/...., there are clk tree info, but
> > clk api are not exposed to userspace as far as I know, so
> > if using sysfs interface to set a known fixed rate or enable/disable
> > the clock,
> > we need to expose the clk info to userspace.
> 
> Keeping in mind that we might now want to let DomU change the actual

s/now/not/?

> clock frequency anyway, exposing a dummy clock to keep the DomU driver
> happy might be a good option.
> 
> However whether we set the frequency from the Dom0 kernel or from the
> toolstack, how do we find out the right clock rate? From a grep in the
> kernel sources, it looks like some drivers still have hardcoded values.
> 
> Asking the user to provide the clock rate seems a bit too much to me.

Remember that for this use case they already need to provide the host DT
path to the device, the list of iomem resources, the list of irqs and a
suitable device tree snippet.

platform device passthrough is very much an "expert" option (intended for
folks building embedded device, not really for normal end users), which
already needs a fair bit of familiarity with the system in question, I
don't think providing some info about the clocks goes too much further than
this.

Seamless & easy passthrough is something we are aiming for for PCI, but not
for the general case of platform passthrough.

Making platform device passthrough as easy as PCI passthrough is much
harder and far reaching than avoiding the need to specify some clock
frequencies (remember we had very long discussions about the design and
couldn't come up with anything especially feasible).

Ian.

Stefano Stabellini Jan. 21, 2016, 4:11 p.m. UTC | #32

On Thu, 21 Jan 2016, Ian Campbell wrote:
> On Thu, 2016-01-21 at 12:55 +0000, Stefano Stabellini wrote:
> > On Thu, 21 Jan 2016, Peng Fan wrote:
> > > Hi Ian,
> > > 
> > > On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
> > > > On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote:
> > > > > Hi Ian,
> > > > > 
> > > > > On Thu, Jan 21, 2016 at 10:19:32AM +0000, Ian Campbell wrote:
> > > > > > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
> > > > > > >  
> > > > > > > To platform device of ARM, hypervisor is responsible for the
> > > > > > > mapping
> > > > > > > between machine address and guest physical address, also
> > > > > > > responsible
> > > > > > > for the irq mapping.
> > > > > > > 
> > > > > > > But to embedded ARM SoC, the hardware clk IP is handled in
> > > > > > > Dom0.
> > > > > > 
> > > > > > Arguably Xen ought to be the one to do this, but we have
> > > > > > determined
> > > > > > (rightly, I think) that doing so for the entirely clk tree of an
> > > > > > SoC
> > > > > > would
> > > > > > involve importing an unmanageable amount of code into Xen[0].
> > > > > > 
> > > > > > In the meantime we defer this to Dom0.
> > > > > > 
> > > > > > I wonder though if it would be possible to manage the clocks for
> > > > > > a
> > > > > > passthrough device from the toolstack, i.e. is there a sysfs node
> > > > > > where
> > > > > > one
> > > > > > can say "keep the clock for this device enabled (at xMhz) even
> > > > > > though
> > > > > > you
> > > > > > think the device is unused"?
> > > > > 
> > > > > I am afraid not. The linux device driver without xen can work well,
> > > > > because
> > > > > there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well
> > > > > in the
> > > > > driver.
> > > > > I do not want to remove the clk apis usage from the device driver
> > > > > for
> > > > > xen, because the driver
> > > > > also serves when no xen support. The pvclk interface is mainly to
> > > > > let the
> > > > > clk api can work as no xen.
> > > > 
> > > > Would adding a dummy fixed-clock[0] (or several of them) to the guest
> > > > passthrough DT satisfy the driver's requirements? They would be
> > > > hardcoded
> > > > to whatever rate dom0 and/or the tools has decided upon (and had set
> > > > in the
> > > > real h/w).
> > > 
> > > If using this way, we have the assumption that DomU device driver would
> > > not
> > > change the rate of the clock driving the device. I am not sure whether
> > > this is
> > > ok for so many platform devices based ARM core.
> > > 
> > > In /sys/kernel/debug/clk/...., there are clk tree info, but
> > > clk api are not exposed to userspace as far as I know, so
> > > if using sysfs interface to set a known fixed rate or enable/disable
> > > the clock,
> > > we need to expose the clk info to userspace.
> > 
> > Keeping in mind that we might now want to let DomU change the actual
> 
> s/now/not/?

Sorry, I meant "we might not want to let DomU change" ...


> > clock frequency anyway, exposing a dummy clock to keep the DomU driver
> > happy might be a good option.
> > 
> > However whether we set the frequency from the Dom0 kernel or from the
> > toolstack, how do we find out the right clock rate? From a grep in the
> > kernel sources, it looks like some drivers still have hardcoded values.
> > 
> > Asking the user to provide the clock rate seems a bit too much to me.
> 
> Remember that for this use case they already need to provide the host DT
> path to the device, the list of iomem resources, the list of irqs and a
> suitable device tree snippet.
> 
> platform device passthrough is very much an "expert" option (intended for
> folks building embedded device, not really for normal end users), which
> already needs a fair bit of familiarity with the system in question, I
> don't think providing some info about the clocks goes too much further than
> this.
> 
> Seamless & easy passthrough is something we are aiming for for PCI, but not
> for the general case of platform passthrough.
> 
> Making platform device passthrough as easy as PCI passthrough is much
> harder and far reaching than avoiding the need to specify some clock
> frequencies (remember we had very long discussions about the design and
> couldn't come up with anything especially feasible).

I agree that passthrough of non-PCI devices is a feature for "experts"
and we don't have many good solutions to this problem.

But one thing is asking for a device tree snippet, another is asking
for the clock frequency: as an expert user, finding out the device tree
node can be done by reading the device tree, the docs or googling for
it, but how is a user supposed to find out the clock frequency?  I don't
think that asking expert users to grep for values in the Linux source tree 
should be an option. Is the frequency at least supposed to be documented
in the hardware manual?

Should we have a list of known devices and frequencies in libxl?  Maybe
not, but we could consider giving users the option to allow the DomU to
set the frequency. Maybe we could introduce something like:
"xen,passthrough,allow-clock-change".

I am not arguing for one specific solution, I am just brainstorming.

Peng Fan Jan. 22, 2016, 1:56 a.m. UTC | #33

Hi Jan,

On Thu, Jan 21, 2016 at 05:52:12AM -0700, Jan Beulich wrote:
>>>> On 21.01.16 at 13:06, <van.freenix@gmail.com> wrote:
>> On Thu, Jan 21, 2016 at 03:21:38AM -0700, Jan Beulich wrote:
>>>>>> On 21.01.16 at 09:59, <van.freenix@gmail.com> wrote:
>>>> uart2 needs clock IMX7D_UART2_ROOT_CLK from the ccm.
>>>> passthrough uart2, hypervisor handles the reg and interrupts, that is 
>>>> because
>>>> hypervisor handles the memory map and the interrupt controller(GIC). But 
>>>> here
>>>> CCM is not handled by hypervisor, it is handled by Dom0.
>>>
>>>This, I take it, describes the current state, which doesn't make clear
>>>whether this is intentionally that way (and can't be changed), or
>>>just happens to be that way because so far it didn't matter.
>>>
>>>> For ARMV8 server products, I am not sure whether hypercall is better; but to
>>>> my case, it's not feasible to use hypercall.
>>>
>>>Because of ...?
>> 
>> I guess you mean DomU issues hypercall and Xen forwards the request to Dom0,
>> then Dom0 reply the response?
>
>Well, no, that wouldn't be a desirable (or even sane) model.
>
>>>> Dom0 handles all the clocks, DomU just send request to Dom0 and ask Dom0 to
>>>> enable/disable/set rate for a clock for the device. So I think it's okay
>>>> for multipile parties, the clk subsystem in Dom0 can handle mutiple requests 
>>>> even if the clock is shared.
>>>
>>>I.e. if one party sets one rate, and later another party sets
>>>a different rate, things are going to work fine? If so, why would
>>>the different parties need control over the rate in the first place?
>> 
>> oh. thanks for teaching me. If two IPs share one clock, and IP1 for Dom0, 
>> IP2 for DomU,
>> Dom0 needs clock work at 20Hz, but DomU want clock work at 40Hz. pvclk can 
>> not avoid this.
>
>But you mustn't allow a DomU to affect Dom0, nor may two DomU-s
>interact in such a way with one another.
>
>> If not using pvclk, we develop a new method to handle clock. I guest we can 
>> also not avoid this?
>
>At the very least it would need to be avoided by denying the request.
>Upon shared use, either all parties agree, or only one may use the
>clock. And passing through a (platform) device would therefore imply
>validating that the needed clock(s) are available to the target domain.
>Doing this in a consistent way with all control in one component's
>hands seems doable only if hypervisor and/or tool stack are the
>controlling (and arbitrating) entity. In the end this is one of the
>reasons why to me a simple PV I/O interface doesn't seem suitable
>here.

How about let userspace libxl pvclk code to denying the request?

If the pvclk interface is not desirable, I have no more idea on this for now(:

Thanks,
Peng.
>
>Jan
>

Peng Fan Jan. 22, 2016, 2:19 a.m. UTC | #34

Hi Ian,

On Thu, Jan 21, 2016 at 12:49:24PM +0000, Ian Campbell wrote:
>On Thu, 2016-01-21 at 20:35 +0800, Peng Fan wrote:
>> On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
>> > Would adding a dummy fixed-clock[0] (or several of them) to the guest
>> > passthrough DT satisfy the driver's requirements? They would be hardcoded
>> > to whatever rate dom0 and/or the tools has decided upon (and had set in the
>> > real h/w).
>> 
>> If using this way, we have the assumption that DomU device driver would not
>> change the rate of the clock driving the device. I am not sure whether this is
>> ok for so many platform devices based ARM core.
>
>Don't (non-buggy) drivers already need to cope with the possibility that
>the clk core may not be able to satisfy their request for a particular rate
>due to constraints from other ip blocks?

Yeah. the drivers should be ready to cope with this.

>
>I would also expect the user to want to configure things in dom0 such that
>the specific drivers used in domU are satisfied with what they get (which
>is a bit fiddly perhaps, but platform device passthrough already is
>somewhat that way).

Let user to configure such as pinctrl and clk for the platform device in Dom0.
But the clk is set at a fixed rate and let the device in DomU use it.
To embedded products, power usage is sensitive. If I passed through GPU controller
to DomU, and fix the clock rate at 400M in Dom0, it will consume more power than let DomU
change the rate. Also platform device passthrough is not hotplugable now, this means
that the gpu controller will always be alive and clock is always enabled.
I am not sure whether this is acceptable or not.

>
>> In /sys/kernel/debug/clk/...., there are clk tree info, but
>> clk api are not exposed to userspace as far as I know, so
>> if using sysfs interface to set a known fixed rate or enable/disable the clock,
>> we need to expose the clk info to userspace.
>
>That seems possible to arrange given a use case for it though, a SMOP (and
>convincing the clk maintainer of the need).
>
>Worst case is a xen-clkback driver or perhaps even vfio will want to do
>something like this, we can't normally use vfio on Xen, but in this case
>perhaps we could.

vfio is new to me. Just google it. iommu is not a must for platform device passthrough.
If the platform device supports DMA, then smmu is a must.

>
>
>> Jan said using hypercall in the other mail, do you have any ideas about
>> this?
>
>This would need a whole clock infrastructure in Xen, wouldn't it? I
>described why that is not currently available in an earlier mail, and also
>my opinion that doing the whole thing in Xen would involve pulling in far
>too much SoC specific code for each specific platform.

Thanks for clarifying.

Thanks,
Peng.

>
>Ian.
>>

Peng Fan Jan. 22, 2016, 2:51 a.m. UTC | #35

Hi Ian and Stefano,

On Thu, Jan 21, 2016 at 04:11:45PM +0000, Stefano Stabellini wrote:
>On Thu, 21 Jan 2016, Ian Campbell wrote:
>> On Thu, 2016-01-21 at 12:55 +0000, Stefano Stabellini wrote:
>> > On Thu, 21 Jan 2016, Peng Fan wrote:
>> > > Hi Ian,
>> > > 
>> > > On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
>> > > > On Thu, 2016-01-21 at 19:55 +0800, Peng Fan wrote:
>> > > > > Hi Ian,
>> > > > > 
>> > > > > On Thu, Jan 21, 2016 at 10:19:32AM +0000, Ian Campbell wrote:
>> > > > > > On Thu, 2016-01-21 at 16:59 +0800, Peng Fan wrote:
>> > > > > > >  
>> > > > > > > To platform device of ARM, hypervisor is responsible for the
>> > > > > > > mapping
>> > > > > > > between machine address and guest physical address, also
>> > > > > > > responsible
>> > > > > > > for the irq mapping.
>> > > > > > > 
>> > > > > > > But to embedded ARM SoC, the hardware clk IP is handled in
>> > > > > > > Dom0.
>> > > > > > 
>> > > > > > Arguably Xen ought to be the one to do this, but we have
>> > > > > > determined
>> > > > > > (rightly, I think) that doing so for the entirely clk tree of an
>> > > > > > SoC
>> > > > > > would
>> > > > > > involve importing an unmanageable amount of code into Xen[0].
>> > > > > > 
>> > > > > > In the meantime we defer this to Dom0.
>> > > > > > 
>> > > > > > I wonder though if it would be possible to manage the clocks for
>> > > > > > a
>> > > > > > passthrough device from the toolstack, i.e. is there a sysfs node
>> > > > > > where
>> > > > > > one
>> > > > > > can say "keep the clock for this device enabled (at xMhz) even
>> > > > > > though
>> > > > > > you
>> > > > > > think the device is unused"?
>> > > > > 
>> > > > > I am afraid not. The linux device driver without xen can work well,
>> > > > > because
>> > > > > there is clk tree and "clk_get,clk_prepare,clk_set_rate" work well
>> > > > > in the
>> > > > > driver.
>> > > > > I do not want to remove the clk apis usage from the device driver
>> > > > > for
>> > > > > xen, because the driver
>> > > > > also serves when no xen support. The pvclk interface is mainly to
>> > > > > let the
>> > > > > clk api can work as no xen.
>> > > > 
>> > > > Would adding a dummy fixed-clock[0] (or several of them) to the guest
>> > > > passthrough DT satisfy the driver's requirements? They would be
>> > > > hardcoded
>> > > > to whatever rate dom0 and/or the tools has decided upon (and had set
>> > > > in the
>> > > > real h/w).
>> > > 
>> > > If using this way, we have the assumption that DomU device driver would
>> > > not
>> > > change the rate of the clock driving the device. I am not sure whether
>> > > this is
>> > > ok for so many platform devices based ARM core.
>> > > 
>> > > In /sys/kernel/debug/clk/...., there are clk tree info, but
>> > > clk api are not exposed to userspace as far as I know, so
>> > > if using sysfs interface to set a known fixed rate or enable/disable
>> > > the clock,
>> > > we need to expose the clk info to userspace.
>> > 
>> > Keeping in mind that we might now want to let DomU change the actual
>> 
>> s/now/not/?
>
>Sorry, I meant "we might not want to let DomU change" ...

Why not want to let DomU change? I can not get this.

>
>
>> > clock frequency anyway, exposing a dummy clock to keep the DomU driver
>> > happy might be a good option.
>> > 
>> > However whether we set the frequency from the Dom0 kernel or from the
>> > toolstack, how do we find out the right clock rate? From a grep in the
>> > kernel sources, it looks like some drivers still have hardcoded values.
>> > 
>> > Asking the user to provide the clock rate seems a bit too much to me.
>> 
>> Remember that for this use case they already need to provide the host DT
>> path to the device, the list of iomem resources, the list of irqs and a
>> suitable device tree snippet.
>> 
>> platform device passthrough is very much an "expert" option (intended for
>> folks building embedded device, not really for normal end users), which
>> already needs a fair bit of familiarity with the system in question, I
>> don't think providing some info about the clocks goes too much further than
>> this.
>> 
>> Seamless & easy passthrough is something we are aiming for for PCI, but not
>> for the general case of platform passthrough.
>> 
>> Making platform device passthrough as easy as PCI passthrough is much
>> harder and far reaching than avoiding the need to specify some clock
>> frequencies (remember we had very long discussions about the design and
>> couldn't come up with anything especially feasible).
>
>I agree that passthrough of non-PCI devices is a feature for "experts"
>and we don't have many good solutions to this problem.
>
>But one thing is asking for a device tree snippet, another is asking
>for the clock frequency: as an expert user, finding out the device tree
>node can be done by reading the device tree, the docs or googling for
>it, but how is a user supposed to find out the clock frequency?  I don't
>think that asking expert users to grep for values in the Linux source tree 
>should be an option. Is the frequency at least supposed to be documented
>in the hardware manual?

I just asked our module owners about the clock rate. The rate for different
device is fixed most of the time, except runs into suspend state.
And the value for the rate is recommended by IC owner.

I think I can try the fixed clock way, if this is the better one, and we can ignore the runtime device clock change.

1. expose the kernel clk api to userspace, dir /sys/kernel/debug/clk/xxx
2. Introduce a entry in xl configuration file, like this clks=["name=uart_root_clk, rate=24000000", "name=gpu_root_clk,rate=400000000"].
2. Let libxl parse the clks entry and construct the clock node for DomU dts, and marked as fixed clock.
3. In DomU kernel, wrote a xen clk driver to serve the clk api usage in platform device drivers.

I am not sure whether vfio can be used here or not.

Any comments on this?

Thanks,
Peng.

>Should we have a list of known devices and frequencies in libxl?  Maybe
>not, but we could consider giving users the option to allow the DomU to
>set the frequency. Maybe we could introduce something like:
>"xen,passthrough,allow-clock-change".
>
>I am not arguing for one specific solution, I am just brainstorming.

Jan Beulich Jan. 22, 2016, 7:36 a.m. UTC | #36

>>> On 22.01.16 at 02:56, <van.freenix@gmail.com> wrote:
> On Thu, Jan 21, 2016 at 05:52:12AM -0700, Jan Beulich wrote:
>>At the very least it would need to be avoided by denying the request.
>>Upon shared use, either all parties agree, or only one may use the
>>clock. And passing through a (platform) device would therefore imply
>>validating that the needed clock(s) are available to the target domain.
>>Doing this in a consistent way with all control in one component's
>>hands seems doable only if hypervisor and/or tool stack are the
>>controlling (and arbitrating) entity. In the end this is one of the
>>reasons why to me a simple PV I/O interface doesn't seem suitable
>>here.
> 
> How about let userspace libxl pvclk code to denying the request?

Userspace would be fine, but
- How would this fit in your frontend/backend model, where
  userspace shouldn't be involved at all?
- Libxl may be a little too high up the stack, libxc would seem a
  more appropriate place to me (but that's subject to tools
  maintainers disagreeing with me).

Jan

Peng Fan Jan. 22, 2016, 9:27 a.m. UTC | #37

Hi Jan,

On Fri, Jan 22, 2016 at 12:36:31AM -0700, Jan Beulich wrote:
>>>> On 22.01.16 at 02:56, <van.freenix@gmail.com> wrote:
>> On Thu, Jan 21, 2016 at 05:52:12AM -0700, Jan Beulich wrote:
>>>At the very least it would need to be avoided by denying the request.
>>>Upon shared use, either all parties agree, or only one may use the
>>>clock. And passing through a (platform) device would therefore imply
>>>validating that the needed clock(s) are available to the target domain.
>>>Doing this in a consistent way with all control in one component's
>>>hands seems doable only if hypervisor and/or tool stack are the
>>>controlling (and arbitrating) entity. In the end this is one of the
>>>reasons why to me a simple PV I/O interface doesn't seem suitable
>>>here.
>> 
>> How about let userspace libxl pvclk code to denying the request?
>
>Userspace would be fine, but

Then you are ok with the pvclk way to handle clock for platform device passthrough?

>- How would this fit in your frontend/backend model, where
>  userspace shouldn't be involved at all?

rethought about this. clk is binded to device. we can not passthrough
one device to two guest, so this means we can not let two different
guest access one clk input. Since this is mainly for embedded products,
just as Ian said "experts option", the developer should be aware of
the clk sharing between two device.

If we truly need to let userspace deny the request. If one clk
already assigned to Dom1, then the toolstack need to fail
the creation of Dom2, if Dom2 want to use the same clock.

In xl configuraiton file for Dom1, introduce such an entry,
vclks=["gpu_root_clk,uart2_root_clk,usdhc2_root_clk"]
and toolstack will create the xenstore nodes.
/local/domain/0/backend/vclk/1/0/nrclks-->3
/local/domain/0/backend/vclk/1/0/clk-0/name-->gpu_root_clk
/local/domain/0/backend/vclk/1/0/clk-1/name-->uart2_root_clk
/local/domain/0/backend/vclk/1/0/clk-2/name-->usdhc2_root_clk

/local/domain/1/device/vclk/0/clk-ring-ref
/local/domain/1/device/vclk/0/event-channel

The DomU dts also contains the clk names. Maybe the dts for clk node can 
be created by the toolstack.

If Dom2 also want to use gpu_root_clk, it should be blocked by toolstack.

Thanks,
Peng.
>- Libxl may be a little too high up the stack, libxc would seem a
>  more appropriate place to me (but that's subject to tools
>  maintainers disagreeing with me).
>
>Jan
>

Jan Beulich Jan. 22, 2016, 10:25 a.m. UTC | #38

>>> On 22.01.16 at 10:27, <van.freenix@gmail.com> wrote:
> Hi Jan,
> 
> On Fri, Jan 22, 2016 at 12:36:31AM -0700, Jan Beulich wrote:
>>>>> On 22.01.16 at 02:56, <van.freenix@gmail.com> wrote:
>>> On Thu, Jan 21, 2016 at 05:52:12AM -0700, Jan Beulich wrote:
>>>>At the very least it would need to be avoided by denying the request.
>>>>Upon shared use, either all parties agree, or only one may use the
>>>>clock. And passing through a (platform) device would therefore imply
>>>>validating that the needed clock(s) are available to the target domain.
>>>>Doing this in a consistent way with all control in one component's
>>>>hands seems doable only if hypervisor and/or tool stack are the
>>>>controlling (and arbitrating) entity. In the end this is one of the
>>>>reasons why to me a simple PV I/O interface doesn't seem suitable
>>>>here.
>>> 
>>> How about let userspace libxl pvclk code to denying the request?
>>
>>Userspace would be fine, but
> 
> Then you are ok with the pvclk way to handle clock for platform device 
> passthrough?

No, not really. While I accept that doing clock management in the
hypervisor is undesirable, we're still not at the point where such
a frontend/backend pair would look like the only possible route
out of a dilemma, and I continue to think that this proposed model
should indeed only be the last resort.

In particular, with the user space exposure of clock control
discussed in another sub-thread, the next best option would
seem to be to handle this via emulation in a device model. Yes,
ARM guests currently have no qemu attached to them, but I
guess sooner or later this will need to change anyway.

>>- How would this fit in your frontend/backend model, where
>>  userspace shouldn't be involved at all?
> 
> rethought about this. clk is binded to device. we can not passthrough
> one device to two guest, so this means we can not let two different
> guest access one clk input. Since this is mainly for embedded products,
> just as Ian said "experts option", the developer should be aware of
> the clk sharing between two device.
> 
> If we truly need to let userspace deny the request. If one clk
> already assigned to Dom1, then the toolstack need to fail
> the creation of Dom2, if Dom2 want to use the same clock.

I.e. you're now proposing actual assignment of clocks to guests?
That's at least one step in the (from my pov) right direction...

Jan

Peng Fan Jan. 22, 2016, 12:12 p.m. UTC | #39

Hi Jan,

On Fri, Jan 22, 2016 at 03:25:40AM -0700, Jan Beulich wrote:
>>>> On 22.01.16 at 10:27, <van.freenix@gmail.com> wrote:
>> Hi Jan,
>> 
>> On Fri, Jan 22, 2016 at 12:36:31AM -0700, Jan Beulich wrote:
>>>>>> On 22.01.16 at 02:56, <van.freenix@gmail.com> wrote:
>>>> On Thu, Jan 21, 2016 at 05:52:12AM -0700, Jan Beulich wrote:
>>>>>At the very least it would need to be avoided by denying the request.
>>>>>Upon shared use, either all parties agree, or only one may use the
>>>>>clock. And passing through a (platform) device would therefore imply
>>>>>validating that the needed clock(s) are available to the target domain.
>>>>>Doing this in a consistent way with all control in one component's
>>>>>hands seems doable only if hypervisor and/or tool stack are the
>>>>>controlling (and arbitrating) entity. In the end this is one of the
>>>>>reasons why to me a simple PV I/O interface doesn't seem suitable
>>>>>here.
>>>> 
>>>> How about let userspace libxl pvclk code to denying the request?
>>>
>>>Userspace would be fine, but
>> 
>> Then you are ok with the pvclk way to handle clock for platform device 
>> passthrough?
>
>No, not really. While I accept that doing clock management in the
>hypervisor is undesirable, we're still not at the point where such
>a frontend/backend pair would look like the only possible route
>out of a dilemma, and I continue to think that this proposed model
>should indeed only be the last resort.

Thanks for following the thread and giving comments.
Alougth this frustrate me, we still need to find a better option for this.

>
>In particular, with the user space exposure of clock control
>discussed in another sub-thread, the next best option would
>seem to be to handle this via emulation in a device model. Yes,
>ARM guests currently have no qemu attached to them, but I
>guess sooner or later this will need to change anyway.

I have not look into qemu for xen.
If using qemu, then we still need to expose the clk interface to userspace?

>
>>>- How would this fit in your frontend/backend model, where
>>>  userspace shouldn't be involved at all?
>> 
>> rethought about this. clk is binded to device. we can not passthrough
>> one device to two guest, so this means we can not let two different
>> guest access one clk input. Since this is mainly for embedded products,
>> just as Ian said "experts option", the developer should be aware of
>> the clk sharing between two device.
>> 
>> If we truly need to let userspace deny the request. If one clk
>> already assigned to Dom1, then the toolstack need to fail
>> the creation of Dom2, if Dom2 want to use the same clock.
>
>I.e. you're now proposing actual assignment of clocks to guests?
>That's at least one step in the (from my pov) right direction...

Based on the pvclk, I am coding the userspace tool part. Alought we have
not find a good solution for this, I first need it work on my platform.

Later I'll also try the fixed clock way.

Thanks,
Peng.
>
>Jan
>

Jan Beulich Jan. 22, 2016, 12:33 p.m. UTC | #40

>>> On 22.01.16 at 13:12, <van.freenix@gmail.com> wrote:
> On Fri, Jan 22, 2016 at 03:25:40AM -0700, Jan Beulich wrote:
>>In particular, with the user space exposure of clock control
>>discussed in another sub-thread, the next best option would
>>seem to be to handle this via emulation in a device model. Yes,
>>ARM guests currently have no qemu attached to them, but I
>>guess sooner or later this will need to change anyway.
> 
> I have not look into qemu for xen.
> If using qemu, then we still need to expose the clk interface to userspace?

I think so, yes. In fact - see above - the discussed userspace
exposure made me consider this option.

Jan

Stefano Stabellini Jan. 22, 2016, 1:55 p.m. UTC | #41

On Fri, 22 Jan 2016, Jan Beulich wrote:
> >>> On 22.01.16 at 13:12, <van.freenix@gmail.com> wrote:
> > On Fri, Jan 22, 2016 at 03:25:40AM -0700, Jan Beulich wrote:
> >>In particular, with the user space exposure of clock control
> >>discussed in another sub-thread, the next best option would
> >>seem to be to handle this via emulation in a device model. Yes,
> >>ARM guests currently have no qemu attached to them, but I
> >>guess sooner or later this will need to change anyway.
> > 
> > I have not look into qemu for xen.
> > If using qemu, then we still need to expose the clk interface to userspace?
> 
> I think so, yes. In fact - see above - the discussed userspace
> exposure made me consider this option.

I would think twice before introducing QEMU on Xen on ARM for many
reasons. I don't think that a simple clock is worth the downsides.

If we really have to emulate it, we could do that in Xen. We could have
an extremely simple clock that only takes one or two commands.

Peng Fan Jan. 23, 2016, 3:26 p.m. UTC | #42

Hi Ian,

On Thu, Jan 21, 2016 at 12:49:24PM +0000, Ian Campbell wrote:
>On Thu, 2016-01-21 at 20:35 +0800, Peng Fan wrote:
>> On Thu, Jan 21, 2016 at 12:26:04PM +0000, Ian Campbell wrote:
>> > Would adding a dummy fixed-clock[0] (or several of them) to the guest
>> > passthrough DT satisfy the driver's requirements? They would be hardcoded
>> > to whatever rate dom0 and/or the tools has decided upon (and had set in the
>> > real h/w).
>> 
>> If using this way, we have the assumption that DomU device driver would not
>> change the rate of the clock driving the device. I am not sure whether this is
>> ok for so many platform devices based ARM core.
>
>Don't (non-buggy) drivers already need to cope with the possibility that
>the clk core may not be able to satisfy their request for a particular rate
>due to constraints from other ip blocks?
>
>I would also expect the user to want to configure things in dom0 such that
>the specific drivers used in domU are satisfied with what they get (which
>is a bit fiddly perhaps, but platform device passthrough already is
>somewhat that way).
>
>> In /sys/kernel/debug/clk/...., there are clk tree info, but
>> clk api are not exposed to userspace as far as I know, so
>> if using sysfs interface to set a known fixed rate or enable/disable the clock,
>> we need to expose the clk info to userspace.
>
>That seems possible to arrange given a use case for it though, a SMOP (and
>convincing the clk maintainer of the need).

Sorry to bother you again on this.

For the fixed clock method, I add such a xl cfg entry: devclk=["uart_root_clk,24000000", "gpu_root_clk,200000000"].
After add code for writeing sysfs clk node in libxl, the clk can be set rate/prepared/enabled.
But I do not know the resource release path, when need to shutdown/destroy a domain. For example:
when domain creation, the clk is prepared, when domain destroied, I want the clk unprepared. But I
do not find out the code path to releasing resource.

Also, since need to add fixed-clock nodes in DomU dts, do you agree to let libxl or xc to create the nodes
in dts, but not mannually add them in passthrough node?

Thanks,
Peng.

>
>Worst case is a xen-clkback driver or perhaps even vfio will want to do
>something like this, we can't normally use vfio on Xen, but in this case
>perhaps we could.
>
>
>> Jan said using hypercall in the other mail, do you have any ideas about
>> this?
>
>This would need a whole clock infrastructure in Xen, wouldn't it? I
>described why that is not currently available in an earlier mail, and also
>my opinion that doing the whole thing in Xen would involve pulling in far
>too much SoC specific code for each specific platform.
>
>Ian.
>>

[RFC,V2] xen: interface: introduce pvclk interface

Commit Message

Comments

Patch