mbox series

[V2,00/10] unify the interface of the proportional-share policy in blkio/io

Message ID 20181119103424.3853-1-paolo.valente@linaro.org (mailing list archive)
Headers show
Series unify the interface of the proportional-share policy in blkio/io | expand

Message

Paolo Valente Nov. 19, 2018, 10:34 a.m. UTC
Hi,
here is the V2 of this patch series. Let me rephrase the description
of the series, in view of the fact that CFQ will be gone with legacy
block.

The current implementation of cgroups doesn't allow two or more
entities, e.g., I/O schedulers, to share the same files.  Thus, to
enable people to set group weights with BFQ, I resorted to making BFQ
create its own version of the same interface files used by CFQ (before
going away with legacy block), by prepending a bfq prefix.

Actually, no legacy code uses these different names, or is likely to
do so.  Having these two sets of names is simply a source of
confusion, as pointed out also, e.g., by Lennart Poettering (CCed
here), and acknowledged by Tejun [2].

In [1] we agreed on a solution that solves this problem, by actually
making it possible to share cgroups files.  Both writing to and
reading from a shared file trigger the appropriate operation for each
of the entities that share the file.  In particular, in case of
reading,
- if all entities produce the same output, the this common output is
 shown only once;
- if the outputs differ, then every per-entity output is shown,
 followed by the name of the entity that produced that output.

With this solution, legacy code that, e.g., sets group weights, just
works, regardless of the I/O scheduler actually implementing
proportional share.

But note that this extension is not restricted to only blkio/io.  The
general group interface now enables files to be shared among multiple
entities of any kind.

(I have also added a patch to fix some clerical errors in bfq doc,
which I found while making the latter consistent with the new
interface.)

CHANGES FROM V1:
- Removed patch that introduced a function to only find kernfs nodes,
  without increasing ref counters
- Changed commit messages and BFQ documentation, to comply with the
  fact that there won't be CFQ any longer

Thanks,
Paolo

Angelo Ruocco (5):
  cgroup: link cftypes of the same subsystem with the same name
  cgroup: add owner name to cftypes
  block, bfq: align min and default weights with the old cfq default
  cgroup: make all functions of all cftypes be invoked
  block, throttle: allow sharing cgroup statistic files

Paolo Valente (5):
  cgroup: add hook seq_show_cft with also the owning cftype as parameter
  block, cgroup: pass cftype to functions that need to use it
  block, bfq: use standard file names for the proportional-share policy
  doc, bfq-iosched: fix a few clerical errors
  doc, bfq-iosched: make it consistent with the new cgroup interface

 Documentation/block/bfq-iosched.txt |  34 ++---
 block/bfq-cgroup.c                  | 148 +++++++++++++-------
 block/bfq-iosched.h                 |   4 +-
 block/blk-cgroup.c                  |  22 +--
 block/blk-throttle.c                |  24 ++--
 include/linux/blk-cgroup.h          |  10 +-
 include/linux/cgroup-defs.h         |  14 +-
 include/linux/cgroup.h              |  13 ++
 kernel/cgroup/cgroup.c              | 262 +++++++++++++++++++++++++++++-------
 9 files changed, 390 insertions(+), 141 deletions(-)

--
2.16.1

Comments

Tejun Heo Nov. 20, 2018, 4:28 p.m. UTC | #1
Hello, Paolo.

On Mon, Nov 19, 2018 at 11:34:14AM +0100, Paolo Valente wrote:
> - if all entities produce the same output, the this common output is
>  shown only once;
> - if the outputs differ, then every per-entity output is shown,
>  followed by the name of the entity that produced that output.

So, this doesn't make sense to me.  One set of numbers is meaningful,
the other not and the user doesn't have a way to tell which one is.
It makes no sense to present both numbers.

Thanks.
Paolo Valente Nov. 20, 2018, 4:50 p.m. UTC | #2
> Il giorno 20 nov 2018, alle ore 17:28, Tejun Heo <tj@kernel.org> ha scritto:
> 
> Hello, Paolo.
> 
> On Mon, Nov 19, 2018 at 11:34:14AM +0100, Paolo Valente wrote:
>> - if all entities produce the same output, the this common output is
>> shown only once;
>> - if the outputs differ, then every per-entity output is shown,
>> followed by the name of the entity that produced that output.
> 
> So, this doesn't make sense to me.  One set of numbers is meaningful,
> the other not and the user doesn't have a way to tell which one is.
> It makes no sense to present both numbers.
> 

I do agree that these numbers may confuse.  Before discussing how to
do this better, let me tell you why we are showing them.

In the first place, the need for a diversified output showed up in the
following case.  Given a group using the blkio/io controller, and two
drives
- one with legacy block and cfq
- one with blk-mq and bfq
there will be different statistics for each scheduler, for the same
interface files.

Then we understood that exactly the same happens with throttling, in
case the latter is activated on different devices w.r.t. bfq.

In addition, the same may happen, in the near future, with the
bandwidth controller Josef is working on.  If the controller can be
configured per device, as with throttling, then statistics may differ,
for the same interface files, between bfq, throttling and that
controller.

More general examples could be made considering that this extension is
for the generic cgroup interface.

Of course, suggestions for a clearer way to show these numbers are
more than welcome!  Maybe involved device identifiers can be somehow
gathered by the entities providing these numbers, and then shown?  In
this respect, consider that, even without this extension, one still
has the fundamental problem of not knowing to which devices numbers
apply (unless I'm missing something else).

Thanks,
Paolo

> Thanks.
> 
> -- 
> tejun
Paolo Valente Nov. 30, 2018, 6:23 p.m. UTC | #3
> Il giorno 20 nov 2018, alle ore 17:50, Paolo Valente <paolo.valente@linaro.org> ha scritto:
> 
> 
> 
>> Il giorno 20 nov 2018, alle ore 17:28, Tejun Heo <tj@kernel.org> ha scritto:
>> 
>> Hello, Paolo.
>> 
>> On Mon, Nov 19, 2018 at 11:34:14AM +0100, Paolo Valente wrote:
>>> - if all entities produce the same output, the this common output is
>>> shown only once;
>>> - if the outputs differ, then every per-entity output is shown,
>>> followed by the name of the entity that produced that output.
>> 
>> So, this doesn't make sense to me.  One set of numbers is meaningful,
>> the other not and the user doesn't have a way to tell which one is.
>> It makes no sense to present both numbers.
>> 
> 
> I do agree that these numbers may confuse.  Before discussing how to
> do this better, let me tell you why we are showing them.
> 
> In the first place, the need for a diversified output showed up in the
> following case.  Given a group using the blkio/io controller, and two
> drives
> - one with legacy block and cfq
> - one with blk-mq and bfq
> there will be different statistics for each scheduler, for the same
> interface files.
> 
> Then we understood that exactly the same happens with throttling, in
> case the latter is activated on different devices w.r.t. bfq.
> 
> In addition, the same may happen, in the near future, with the
> bandwidth controller Josef is working on.  If the controller can be
> configured per device, as with throttling, then statistics may differ,
> for the same interface files, between bfq, throttling and that
> controller.
> 
> More general examples could be made considering that this extension is
> for the generic cgroup interface.
> 
> Of course, suggestions for a clearer way to show these numbers are
> more than welcome!  Maybe involved device identifiers can be somehow
> gathered by the entities providing these numbers, and then shown?  In
> this respect, consider that, even without this extension, one still
> has the fundamental problem of not knowing to which devices numbers
> apply (unless I'm missing something else).
> 

Hi Tejun,
have you had time to look into this?  Any improvement to this
interface is ok for us. We are only interested in finally solving this
interface issue, as, for what concerns us directly, it has been
preventing legacy code to use bfq for years.

Thanks,
Paolo



> Thanks,
> Paolo
> 
>> Thanks.
>> 
>> -- 
>> tejun
> 
> -- 
> You received this message because you are subscribed to the Google Groups "bfq-iosched" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bfq-iosched+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
Tejun Heo Nov. 30, 2018, 6:42 p.m. UTC | #4
Hello, Paolo.

On Fri, Nov 30, 2018 at 07:23:24PM +0100, Paolo Valente wrote:
> > Then we understood that exactly the same happens with throttling, in
> > case the latter is activated on different devices w.r.t. bfq.
> > 
> > In addition, the same may happen, in the near future, with the
> > bandwidth controller Josef is working on.  If the controller can be
> > configured per device, as with throttling, then statistics may differ,
> > for the same interface files, between bfq, throttling and that
> > controller.

So, regardless of how all these are implemented, what's presented to
user should be consistent and clear.  There's no other way around it.
Only what's relevant should be visible to userspace.

> have you had time to look into this?  Any improvement to this
> interface is ok for us. We are only interested in finally solving this
> interface issue, as, for what concerns us directly, it has been
> preventing legacy code to use bfq for years.

Unfortunately, I don't have any implementation proposal, but we can't
show things this way to userspace.

Thanks.
Paolo Valente Nov. 30, 2018, 6:53 p.m. UTC | #5
> Il giorno 30 nov 2018, alle ore 19:42, Tejun Heo <tj@kernel.org> ha scritto:
> 
> Hello, Paolo.
> 
> On Fri, Nov 30, 2018 at 07:23:24PM +0100, Paolo Valente wrote:
>>> Then we understood that exactly the same happens with throttling, in
>>> case the latter is activated on different devices w.r.t. bfq.
>>> 
>>> In addition, the same may happen, in the near future, with the
>>> bandwidth controller Josef is working on.  If the controller can be
>>> configured per device, as with throttling, then statistics may differ,
>>> for the same interface files, between bfq, throttling and that
>>> controller.
> 
> So, regardless of how all these are implemented, what's presented to
> user should be consistent and clear.  There's no other way around it.
> Only what's relevant should be visible to userspace.
> 
>> have you had time to look into this?  Any improvement to this
>> interface is ok for us. We are only interested in finally solving this
>> interface issue, as, for what concerns us directly, it has been
>> preventing legacy code to use bfq for years.
> 
> Unfortunately, I don't have any implementation proposal, but we can't
> show things this way to userspace.
> 

Well, this is not very helpful to move forward :)

Let me try to repeat the problem, to try to help you help us unblock
the situation.

If we have multiple entities attached to the same interface output
file, you don't find it clear that each entity shows the number it
wants to show.  But you have no idea either of how that differentiated
information should be shown.  Is this the situation, or is the problem
somewhere 'above' this level?

If the problem is as I described it, here are some proposal attempts:
1) Do you want file sharing to be allowed only if all entities will
output the same number?  (this seems excessive, but maybe it makes
sense)
2) Do you want only one number to be shown, equal to the sum of the
numbers of each entity?  (in some cases, this may make sense)
3) Do you prefer an average?
4) Do you have any other idea, even if just germinal?

Looking forward to your feedback,
Paolo 


> Thanks.
> 
> -- 
> tejun
> 
> -- 
> You received this message because you are subscribed to the Google Groups "bfq-iosched" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bfq-iosched+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
Angelo Ruocco Dec. 10, 2018, 1:45 p.m. UTC | #6
2018-11-30 19:53 GMT+01:00, Paolo Valente <paolo.valente@linaro.org>:
>
>
>> Il giorno 30 nov 2018, alle ore 19:42, Tejun Heo <tj@kernel.org> ha
>> scritto:
>>
>> Hello, Paolo.
>>
>> On Fri, Nov 30, 2018 at 07:23:24PM +0100, Paolo Valente wrote:
>>>> Then we understood that exactly the same happens with throttling, in
>>>> case the latter is activated on different devices w.r.t. bfq.
>>>>
>>>> In addition, the same may happen, in the near future, with the
>>>> bandwidth controller Josef is working on.  If the controller can be
>>>> configured per device, as with throttling, then statistics may differ,
>>>> for the same interface files, between bfq, throttling and that
>>>> controller.
>>
>> So, regardless of how all these are implemented, what's presented to
>> user should be consistent and clear.  There's no other way around it.
>> Only what's relevant should be visible to userspace.
>>
>>> have you had time to look into this?  Any improvement to this
>>> interface is ok for us. We are only interested in finally solving this
>>> interface issue, as, for what concerns us directly, it has been
>>> preventing legacy code to use bfq for years.
>>
>> Unfortunately, I don't have any implementation proposal, but we can't
>> show things this way to userspace.
>>
>
> Well, this is not very helpful to move forward :)
>
> Let me try to repeat the problem, to try to help you help us unblock
> the situation.
>
> If we have multiple entities attached to the same interface output
> file, you don't find it clear that each entity shows the number it
> wants to show.  But you have no idea either of how that differentiated
> information should be shown.  Is this the situation, or is the problem
> somewhere 'above' this level?
>
> If the problem is as I described it, here are some proposal attempts:
> 1) Do you want file sharing to be allowed only if all entities will
> output the same number?  (this seems excessive, but maybe it makes
> sense)
> 2) Do you want only one number to be shown, equal to the sum of the
> numbers of each entity?  (in some cases, this may make sense)
> 3) Do you prefer an average?
> 4) Do you have any other idea, even if just germinal?

To further add to what Paolo said and better expose the problem, I'd like to
say that all those proposals have issues.
If we only allow "same output" cftypes to be shared then we lose all the
flexibility of this solution, and we need a way for an entity to know other
entities internal variables beforehand, which sounds at least very hard, and
maybe is not even an acceptable thing to do.
To put the average, sum or some other mathematical function in the file only
makes sense for certain cftypes, so also doesn't sound like a good idea. In
fact I can think of scenarios where only seeing the different values of the
entities makes sense for a user.

I understand that the problem is inconsistency: having a file that behaves
differently depending on the situation, and the only way to prevent this I can
think of is to *always* show the entity owner of a certain file (or part of the
output), even when the output would be the same among entities or when the
file is not currently shared but could be. Can this be an acceptable solution?

Angelo

>
> Looking forward to your feedback,
> Paolo
>
>
>> Thanks.
>>
>> --
>> tejun
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "bfq-iosched" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to bfq-iosched+unsubscribe@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>
>
Paolo Valente Dec. 18, 2018, 7:48 a.m. UTC | #7
[RESENDING BECAUSE BOUNCED]

> Il giorno 10 dic 2018, alle ore 14:45, Angelo Ruocco <angelo.ruocco.90@gmail.com> ha scritto:
> 
> 2018-11-30 19:53 GMT+01:00, Paolo Valente <paolo.valente@linaro.org>:
>> 
>> 
>>> Il giorno 30 nov 2018, alle ore 19:42, Tejun Heo <tj@kernel.org> ha
>>> scritto:
>>> 
>>> Hello, Paolo.
>>> 
>>> On Fri, Nov 30, 2018 at 07:23:24PM +0100, Paolo Valente wrote:
>>>>> Then we understood that exactly the same happens with throttling, in
>>>>> case the latter is activated on different devices w.r.t. bfq.
>>>>> 
>>>>> In addition, the same may happen, in the near future, with the
>>>>> bandwidth controller Josef is working on.  If the controller can be
>>>>> configured per device, as with throttling, then statistics may differ,
>>>>> for the same interface files, between bfq, throttling and that
>>>>> controller.
>>> 
>>> So, regardless of how all these are implemented, what's presented to
>>> user should be consistent and clear.  There's no other way around it.
>>> Only what's relevant should be visible to userspace.
>>> 
>>>> have you had time to look into this?  Any improvement to this
>>>> interface is ok for us. We are only interested in finally solving this
>>>> interface issue, as, for what concerns us directly, it has been
>>>> preventing legacy code to use bfq for years.
>>> 
>>> Unfortunately, I don't have any implementation proposal, but we can't
>>> show things this way to userspace.
>>> 
>> 
>> Well, this is not very helpful to move forward :)
>> 
>> Let me try to repeat the problem, to try to help you help us unblock
>> the situation.
>> 
>> If we have multiple entities attached to the same interface output
>> file, you don't find it clear that each entity shows the number it
>> wants to show.  But you have no idea either of how that differentiated
>> information should be shown.  Is this the situation, or is the problem
>> somewhere 'above' this level?
>> 
>> If the problem is as I described it, here are some proposal attempts:
>> 1) Do you want file sharing to be allowed only if all entities will
>> output the same number?  (this seems excessive, but maybe it makes
>> sense)
>> 2) Do you want only one number to be shown, equal to the sum of the
>> numbers of each entity?  (in some cases, this may make sense)
>> 3) Do you prefer an average?
>> 4) Do you have any other idea, even if just germinal?
> 
> To further add to what Paolo said and better expose the problem, I'd like to
> say that all those proposals have issues.
> If we only allow "same output" cftypes to be shared then we lose all the
> flexibility of this solution, and we need a way for an entity to know other
> entities internal variables beforehand, which sounds at least very hard, and
> maybe is not even an acceptable thing to do.
> To put the average, sum or some other mathematical function in the file only
> makes sense for certain cftypes, so also doesn't sound like a good idea. In
> fact I can think of scenarios where only seeing the different values of the
> entities makes sense for a user.
> 
> I understand that the problem is inconsistency: having a file that behaves
> differently depending on the situation, and the only way to prevent this I can
> think of is to *always* show the entity owner of a certain file (or part of the
> output), even when the output would be the same among entities or when the
> file is not currently shared but could be. Can this be an acceptable solution?
> 
> Angelo
> 

Hi Jens, all,
let me push for this interface to be fixed too.  If we don't fix it in
some way, then from 4.21 we well end up with a ridiculous paradox: the
proportional share policy (weights) will of course be available, but
unusable in practice.  In fact, as Lennart--and not only Lennart--can
confirm, no piece of code uses bfq.weight to set weights, or will do
it.

A trivial solution would be to throw away all our work to fix this
issue by extending the interface, and just let bfq use the former cfq
names.  But then the same mess will happen as, e.g., Josef will
propose his proportional-share controller.

Before making this solution, we proposed it and waited for it to be
approved several months ago, so I hope that Tejun concern can be
addressed somehow.

If Tejun cannot see any solution to his concern, then can we just
switch to this extension, considering that
- for non-shared names the interface is *identical* to the current
  one;
- by using this new interface, and getting feedback we could
  understand how to better handle Tejun's concern?

A lot of systems do use weights, and people don't even know that these
systems don't work correctly in blk-mq.  And they won't work correctly
in any available configuration from 4.21, if we don't fix this problem.

Thanks.
Paolo

>> 
>> Looking forward to your feedback,
>> Paolo
>> 
>> 
>>> Thanks.
>>> 
>>> --
>>> tejun
>>> 
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "bfq-iosched" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to bfq-iosched+unsubscribe@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
Tejun Heo Dec. 18, 2018, 4:41 p.m. UTC | #8
Hello, Paolo.

On Tue, Dec 18, 2018 at 08:48:10AM +0100, Paolo Valente wrote:
> If Tejun cannot see any solution to his concern, then can we just
> switch to this extension, considering that
> - for non-shared names the interface is *identical* to the current
>   one;
> - by using this new interface, and getting feedback we could
>   understand how to better handle Tejun's concern?
> A lot of systems do use weights, and people don't even know that these
> systems don't work correctly in blk-mq.  And they won't work correctly
> in any available configuration from 4.21, if we don't fix this problem.

So, when seen from userland, how it should behave isn't vague or
complicated.  For a given device and policy type, there can be only
one implementation active.  It doesn't make sense to have two weight
mechanisms active on one device, right?  So, the interface should only
present what makes sense to the user for both configuration knobs and
statistics, and that'd be a hard requirement because we don't want to
present confusing spurious information to userspace.

There seemd to have been significant misunderstandings as to what the
requirements are when this was discussed way back, so idk what the
good path forward is at this point.  Just keep the current names?

Thanks.
Paolo Valente Dec. 18, 2018, 5:22 p.m. UTC | #9
> Il giorno 18 dic 2018, alle ore 17:41, Tejun Heo <tj@kernel.org> ha scritto:
> 
> Hello, Paolo.
> 
> On Tue, Dec 18, 2018 at 08:48:10AM +0100, Paolo Valente wrote:
>> If Tejun cannot see any solution to his concern, then can we just
>> switch to this extension, considering that
>> - for non-shared names the interface is *identical* to the current
>>  one;
>> - by using this new interface, and getting feedback we could
>>  understand how to better handle Tejun's concern?
>> A lot of systems do use weights, and people don't even know that these
>> systems don't work correctly in blk-mq.  And they won't work correctly
>> in any available configuration from 4.21, if we don't fix this problem.
> 
> So, when seen from userland, how it should behave isn't vague or
> complicated.  For a given device and policy type, there can be only
> one implementation active.

Yes, but the problem is the opposite. You may have
- two different policies, with the same interface parameter, 
- one active on one device
- the other one active on another device

In that case, statistics from one policy necessarily differ from
statistics from the other policy.

In this respect, in a system with more than one drive it already
happens that the same policy is active on different devices.  When
printing a statistics interface file for the policy, the output will
be a list of separate statistics, with a bunch of statistics *for
each* drive (plus a grand total in some cases).

So, our proposal simply extends this scheme in the most natural way:
if, now, also two or more policies share the same statistics file,
then the output will be a list of separate statistics, one for each
policy.  The statistics for each policy will be tagged with the policy
name, and will have the same identical form as above.  It seems the
most natural hierarchical extension of the same scheme.

At any rate, if you don't like it, just tell us how you prefer it
done.  Do you prefer the sharing of statistics file to be simply
forbidden?  (If this can be done.) I think such an incomplete solution
would preserve part of the current mess; but, if this allows us to
exit from this impasse, then it is ok for me.

*Any* feasible option is ok for me. Just pick one.

>  It doesn't make sense to have two weight
> mechanisms active on one device, right?

(Un)fortunately, the problem are not weights.  There won't be two
weights for two policies expiring a weight parameter.  The problems
concerns statistics.  See above.


>  So, the interface should only
> present what makes sense to the user for both configuration knobs and
> statistics, and that'd be a hard requirement because we don't want to
> present confusing spurious information to userspace.
> 
> There seemd to have been significant misunderstandings as to what the
> requirements are when this was discussed way back, so idk what the
> good path forward is at this point.  Just keep the current names?
> 

I don't clearly understand how "just picking the current names" is a
way forward, but if we do not make this extension, in a way or the
other, then two policies will simply not be allowed to share the same
interface files.  And we will be still at the starting point.

Thanks,
Paolo

> Thanks.
> 
> -- 
> tejun
Paolo Valente Dec. 18, 2018, 6:05 p.m. UTC | #10
> Il giorno 18 dic 2018, alle ore 18:22, Paolo Valente <paolo.valente@linaro.org> ha scritto:
> 
> 
> 
>> Il giorno 18 dic 2018, alle ore 17:41, Tejun Heo <tj@kernel.org> ha scritto:
>> 
>> Hello, Paolo.
>> 
>> On Tue, Dec 18, 2018 at 08:48:10AM +0100, Paolo Valente wrote:
>>> If Tejun cannot see any solution to his concern, then can we just
>>> switch to this extension, considering that
>>> - for non-shared names the interface is *identical* to the current
>>> one;
>>> - by using this new interface, and getting feedback we could
>>> understand how to better handle Tejun's concern?
>>> A lot of systems do use weights, and people don't even know that these
>>> systems don't work correctly in blk-mq.  And they won't work correctly
>>> in any available configuration from 4.21, if we don't fix this problem.
>> 
>> So, when seen from userland, how it should behave isn't vague or
>> complicated.  For a given device and policy type, there can be only
>> one implementation active.
> 
> Yes, but the problem is the opposite. You may have
> - two different policies, with the same interface parameter, 
> - one active on one device
> - the other one active on another device
> 
> In that case, statistics from one policy necessarily differ from
> statistics from the other policy.
> 
> In this respect, in a system with more than one drive it already
> happens that the same policy is active on different devices.  When
> printing a statistics interface file for the policy, the output will
> be a list of separate statistics, with a bunch of statistics *for
> each* drive (plus a grand total in some cases).
> 
> So, our proposal simply extends this scheme in the most natural way:
> if, now, also two or more policies share the same statistics file,
> then the output will be a list of separate statistics, one for each
> policy.  The statistics for each policy will be tagged with the policy
> name, and will have the same identical form as above.  It seems the
> most natural hierarchical extension of the same scheme.
> 
> At any rate, if you don't like it, just tell us how you prefer it
> done.  Do you prefer the sharing of statistics file to be simply
> forbidden?  (If this can be done.) I think such an incomplete solution
> would preserve part of the current mess; but, if this allows us to
> exit from this impasse, then it is ok for me.
> 
> *Any* feasible option is ok for me. Just pick one.
> 
>> It doesn't make sense to have two weight
>> mechanisms active on one device, right?
> 
> (Un)fortunately, the problem are not weights.  There won't be two
> weights for two policies expiring a weight parameter.  The problems

s/expiring/sharing sorry
Paolo Valente Dec. 23, 2018, 11 a.m. UTC | #11
> Il giorno 18 dic 2018, alle ore 18:22, Paolo Valente <paolo.valente@linaro.org> ha scritto:
> 
> 
> 
>> Il giorno 18 dic 2018, alle ore 17:41, Tejun Heo <tj@kernel.org> ha scritto:
>> 
>> Hello, Paolo.
>> 
>> On Tue, Dec 18, 2018 at 08:48:10AM +0100, Paolo Valente wrote:
>>> If Tejun cannot see any solution to his concern, then can we just
>>> switch to this extension, considering that
>>> - for non-shared names the interface is *identical* to the current
>>> one;
>>> - by using this new interface, and getting feedback we could
>>> understand how to better handle Tejun's concern?
>>> A lot of systems do use weights, and people don't even know that these
>>> systems don't work correctly in blk-mq.  And they won't work correctly
>>> in any available configuration from 4.21, if we don't fix this problem.
>> 
>> So, when seen from userland, how it should behave isn't vague or
>> complicated.  For a given device and policy type, there can be only
>> one implementation active.
> 
> Yes, but the problem is the opposite. You may have
> - two different policies, with the same interface parameter, 
> - one active on one device
> - the other one active on another device
> 
> In that case, statistics from one policy necessarily differ from
> statistics from the other policy.
> 
> In this respect, in a system with more than one drive it already
> happens that the same policy is active on different devices.  When
> printing a statistics interface file for the policy, the output will
> be a list of separate statistics, with a bunch of statistics *for
> each* drive (plus a grand total in some cases).
> 
> So, our proposal simply extends this scheme in the most natural way:
> if, now, also two or more policies share the same statistics file,
> then the output will be a list of separate statistics, one for each
> policy.  The statistics for each policy will be tagged with the policy
> name, and will have the same identical form as above.  It seems the
> most natural hierarchical extension of the same scheme.
> 

Maybe my generic description didn't highlight how plain are.

If you print, e.g., io_serviced with the current interface, you get

--------------------------

8:0 Read 4291168
8:0 Write 2181577
8:0 Sync 5897755
8:0 Async 574990
8:0 Total 6472745
Total 6472745

--------------------------

With the new, interface, you get *the same output*, if only one policy
is attached to this interface file.  In, instead
- two policies share
the the file, because one is active on a device and one on another
device
- these policies are named, e.g., bfq and pol2
then you get (device number and statistics invented):

--------------------------

bfq:
8:0 Read 4291168
8:0 Write 2181577
8:0 Sync 5897755
8:0 Async 574990
8:0 Total 6472745
Total 6472745

pol2:
16:0 Read 238768
16:0 Write 323123
16:0 Sync 43243
16:0 Async 432432
16:0 Total 412435
Total 4341244

--------------------------

So you see the per-device statistics as before, without the problem
of inventing a new set of names for every new policy that has the same
interface files of an existing policy.

Tejun, let's try to converge, to whatever solution you prefer.


4.21 is coming ...  and the legacy proportional share interface will
be gone with cfq.  This will break legacy code using the
proportional-share interface on top of bfq.  This code may just fail
when trying to use interface files that don't exist any longer.

Thanks,
Paolo

> At any rate, if you don't like it, just tell us how you prefer it
> done.  Do you prefer the sharing of statistics file to be simply
> forbidden?  (If this can be done.) I think such an incomplete solution
> would preserve part of the current mess; but, if this allows us to
> exit from this impasse, then it is ok for me.
> 
> *Any* feasible option is ok for me. Just pick one.
> 
>> It doesn't make sense to have two weight
>> mechanisms active on one device, right?
> 
> (Un)fortunately, the problem are not weights.  There won't be two
> weights for two policies expiring a weight parameter.  The problems
> concerns statistics.  See above.
> 
> 
>> So, the interface should only
>> present what makes sense to the user for both configuration knobs and
>> statistics, and that'd be a hard requirement because we don't want to
>> present confusing spurious information to userspace.
>> 
>> There seemd to have been significant misunderstandings as to what the
>> requirements are when this was discussed way back, so idk what the
>> good path forward is at this point.  Just keep the current names?
>> 
> 
> I don't clearly understand how "just picking the current names" is a
> way forward, but if we do not make this extension, in a way or the
> other, then two policies will simply not be allowed to share the same
> interface files.  And we will be still at the starting point.
> 
> Thanks,
> Paolo
> 
>> Thanks.
>> 
>> -- 
>> tejun
> 
> -- 
> You received this message because you are subscribed to the Google Groups "bfq-iosched" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to bfq-iosched+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
Tejun Heo Dec. 27, 2018, 11:41 p.m. UTC | #12
Hello, Paolo.

On Sun, Dec 23, 2018 at 12:00:14PM +0100, Paolo Valente wrote:
> 4.21 is coming ...  and the legacy proportional share interface will
> be gone with cfq.  This will break legacy code using the
> proportional-share interface on top of bfq.  This code may just fail
> when trying to use interface files that don't exist any longer.

Sounds like inheriting .cfq namespace would be the easiest.  Would
that work?

Thanks.
Paolo Valente Dec. 30, 2018, 10:25 a.m. UTC | #13
> Il giorno 28 dic 2018, alle ore 00:41, Tejun Heo <tj@kernel.org> ha scritto:
> 
> Hello, Paolo.
> 
> On Sun, Dec 23, 2018 at 12:00:14PM +0100, Paolo Valente wrote:
>> 4.21 is coming ...  and the legacy proportional share interface will
>> be gone with cfq.  This will break legacy code using the
>> proportional-share interface on top of bfq.  This code may just fail
>> when trying to use interface files that don't exist any longer.
> 
> Sounds like inheriting .cfq namespace would be the easiest.  Would
> that work?
> 

For bfq, yes, but what will, e.g., Josef do when he adds his new
proportional-share implementation?  Will he add a new set of names not
used by any legacy piece of code?

What's the benefit of throwing away months of work, on which we agreed
before starting it, and that solves a problem already acknowledged by
interested parties?

Thanks,
Paolo

> Thanks.
> 
> -- 
> tejun
Tejun Heo Jan. 2, 2019, 4:03 p.m. UTC | #14
Hello, Paolo.

On Sun, Dec 30, 2018 at 11:25:25AM +0100, Paolo Valente wrote:
> What's the benefit of throwing away months of work, on which we agreed
> before starting it, and that solves a problem already acknowledged by
> interested parties?

Showing multiple conflicting numbers definitely isn't anything which
is agreed upon.

Thanks.
Paolo Valente Jan. 2, 2019, 4:28 p.m. UTC | #15
> Il giorno 2 gen 2019, alle ore 17:03, Tejun Heo <tj@kernel.org> ha scritto:
> 
> Hello, Paolo.
> 
> On Sun, Dec 30, 2018 at 11:25:25AM +0100, Paolo Valente wrote:
>> What's the benefit of throwing away months of work, on which we agreed
>> before starting it, and that solves a problem already acknowledged by
>> interested parties?
> 
> Showing multiple conflicting numbers definitely isn't anything which
> is agreed upon.
> 

Sorry, of course you din't realize that sharing interface files had
this consequence, otherwise you'd have protested beforehand.

The problem is that this consequence seems unavoidable: if two
policies have different numbers to convey, through a shared interface
file, then they must be allowed to write their different numbers.  To
me, this doesn't sound like a problem.

The only other natural option is no unification, unless you have a
third way.

What do you prefer, or propose?

Thanks,
Paolo

> Thanks.
> 
> -- 
> tejun