diff mbox

[for-next,V2,8/9] IB/mlx4: Support extended create_cq and query_device uverbs

Message ID 1433074457-26437-9-git-send-email-ogerlitz@mellanox.com (mailing list archive)
State Superseded
Headers show

Commit Message

Or Gerlitz May 31, 2015, 12:14 p.m. UTC
From: Matan Barak <matanb@mellanox.com>

Add support for ib_uverbs_ex_create_cq and ib_uverbs_ex_query_device
by setting the appropriate bit in uverbs_ex_cmd_mask.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
---
 drivers/infiniband/hw/mlx4/main.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

Comments

Jason Gunthorpe June 1, 2015, 4:56 p.m. UTC | #1
On Sun, May 31, 2015 at 03:14:16PM +0300, Or Gerlitz wrote:
> From: Matan Barak <matanb@mellanox.com>
> 
> Add support for ib_uverbs_ex_create_cq and ib_uverbs_ex_query_device
> by setting the appropriate bit in uverbs_ex_cmd_mask.

Why is this a seperate patch? Surely the bits should be or'd in the patches
that actually include the code to do the new commands?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matan Barak June 2, 2015, 7:21 a.m. UTC | #2
On Mon, Jun 1, 2015 at 7:56 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Sun, May 31, 2015 at 03:14:16PM +0300, Or Gerlitz wrote:
>> From: Matan Barak <matanb@mellanox.com>
>>
>> Add support for ib_uverbs_ex_create_cq and ib_uverbs_ex_query_device
>> by setting the appropriate bit in uverbs_ex_cmd_mask.
>
> Why is this a seperate patch? Surely the bits should be or'd in the patches
> that actually include the code to do the new commands?
>

Although this patch can stand on its own (requesting unsupported flags
will just fail and the
comp_mask of uhw's will be disabled), there's no real value for not
squashing it into the other mlx4 patches.
We'll do that for the next version.

> Jason

Thanks for the review.

Matan

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe June 2, 2015, 5:07 p.m. UTC | #3
On Tue, Jun 02, 2015 at 10:21:47AM +0300, Matan Barak wrote:
> On Mon, Jun 1, 2015 at 7:56 PM, Jason Gunthorpe
> <jgunthorpe@obsidianresearch.com> wrote:
> > On Sun, May 31, 2015 at 03:14:16PM +0300, Or Gerlitz wrote:
> >> From: Matan Barak <matanb@mellanox.com>
> >>
> >> Add support for ib_uverbs_ex_create_cq and ib_uverbs_ex_query_device
> >> by setting the appropriate bit in uverbs_ex_cmd_mask.
> >
> > Why is this a seperate patch? Surely the bits should be or'd in the patches
> > that actually include the code to do the new commands?
> >
> 
> Although this patch can stand on its own (requesting unsupported flags
> will just fail and the
> comp_mask of uhw's will be disabled), there's no real value for not
> squashing it into the other mlx4 patches.

Then there is something wrong with this 'uverbs_ex_cmd_mask' stuff -
the bit should be clear if the driver cannot handle the ex inputs, but
these patches make all drivers handle the ex style (by checking
flags), so all drivers should have some of the bits set..

An extended command that doesn't use any extended features should
transparently degrade to the normal command as often as possible.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Matan Barak June 3, 2015, 8:57 a.m. UTC | #4
On Tue, Jun 2, 2015 at 8:07 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Tue, Jun 02, 2015 at 10:21:47AM +0300, Matan Barak wrote:
>> On Mon, Jun 1, 2015 at 7:56 PM, Jason Gunthorpe
>> <jgunthorpe@obsidianresearch.com> wrote:
>> > On Sun, May 31, 2015 at 03:14:16PM +0300, Or Gerlitz wrote:
>> >> From: Matan Barak <matanb@mellanox.com>
>> >>
>> >> Add support for ib_uverbs_ex_create_cq and ib_uverbs_ex_query_device
>> >> by setting the appropriate bit in uverbs_ex_cmd_mask.
>> >
>> > Why is this a seperate patch? Surely the bits should be or'd in the patches
>> > that actually include the code to do the new commands?
>> >
>>
>> Although this patch can stand on its own (requesting unsupported flags
>> will just fail and the
>> comp_mask of uhw's will be disabled), there's no real value for not
>> squashing it into the other mlx4 patches.
>
> Then there is something wrong with this 'uverbs_ex_cmd_mask' stuff -
> the bit should be clear if the driver cannot handle the ex inputs, but
> these patches make all drivers handle the ex style (by checking
> flags), so all drivers should have some of the bits set..
>

That's a general comment regarding the extension mechanism.
Since by nature the extended verbs as extendible, one consumer could support
A and B while the other only supports A, but they both indicate they support
this extension verb.
You could argue that if the "flags" field wasn't tested, we would have need this
uverbs_ex_cmd_mask - but because it could be also used by kernel consumers,
this check is necessary.

> An extended command that doesn't use any extended features should
> transparently degrade to the normal command as often as possible.
>

That means that uverbs_ex_cmd_mask should only be used on extended commands that
are user-space specific.
Anyway, we could add these  IB_USER_VERBS_EX_CMD flags to all vendors,
but IMHO this general problem doesn't relate to this series,
which is only about adding timestamp support.

> Jason

Thanks for your comments.

Matan
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe June 3, 2015, 4:31 p.m. UTC | #5
On Wed, Jun 03, 2015 at 11:57:12AM +0300, Matan Barak wrote:
> That's a general comment regarding the extension mechanism.

Yes, but it is also a specific comment about patch #4 which adds,
ib_uverbs_ex_create_cq.

Based on the implementation of create_cq, it is pretty clear that
every driver supports ib_uverbs_ex_create_cq, so patch #4 should just
force the flag in the device register function so it is globally
enabled.

query_device looks like it is the same, passing in the original
structure length will always work on any device. So Mellanox should
send a bugfix patch for that as well, unrelated to this series.

> > An extended command that doesn't use any extended features should
> > transparently degrade to the normal command as often as possible.
> 
> That means that uverbs_ex_cmd_mask should only be used on extended commands that
> are user-space specific.

The bit should only be clear on commands that can never return
anything but ENOSYS. Ie the kernel has no support for the command at
all, or for some reason the driver cannot handle the call.

The latter case should be rare, it would be the case if we added a new
command that was not extending an existing command, for instance.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz June 3, 2015, 6:58 p.m. UTC | #6
On Wed, Jun 3, 2015 at 7:31 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Wed, Jun 03, 2015 at 11:57:12AM +0300, Matan Barak wrote:
>> That's a general comment regarding the extension mechanism.
>
> Yes, but it is also a specific comment about patch #4 which adds,
> ib_uverbs_ex_create_cq.
>
> Based on the implementation of create_cq, it is pretty clear that
> every driver supports ib_uverbs_ex_create_cq, so patch #4 should just
> force the flag in the device register function so it is globally enabled.

But the other drivers currently do not support any CQ creation flag
and hence no extended functionality, I don't see the point signaling
towards user-space that the verb is supported, please elaborate.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe June 3, 2015, 7:16 p.m. UTC | #7
On Wed, Jun 03, 2015 at 09:58:25PM +0300, Or Gerlitz wrote:
> On Wed, Jun 3, 2015 at 7:31 PM, Jason Gunthorpe
> <jgunthorpe@obsidianresearch.com> wrote:
> > On Wed, Jun 03, 2015 at 11:57:12AM +0300, Matan Barak wrote:
> >> That's a general comment regarding the extension mechanism.
> >
> > Yes, but it is also a specific comment about patch #4 which adds,
> > ib_uverbs_ex_create_cq.
> >
> > Based on the implementation of create_cq, it is pretty clear that
> > every driver supports ib_uverbs_ex_create_cq, so patch #4 should just
> > force the flag in the device register function so it is globally enabled.
> 
> But the other drivers currently do not support any CQ creation flag
> and hence no extended functionality, I don't see the point signaling
> towards user-space that the verb is supported, please elaborate.

They support the base functionality, the flags = 0 case.

There is no reason to block access to the base functionality via the
extended api. That just creates hassles for userspace.

If userspace detects the extended API is present, it can just
switch unconditionally all usage to that API. This is how most new
kernel syscalls are introduced (glibc does this transparently).

Detecting what flags a driver supports (if any) is any entirely
different and orthogonal issue to introducing comp_mask/etc.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Or Gerlitz June 3, 2015, 7:35 p.m. UTC | #8
On Wed, Jun 3, 2015 at 10:16 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
> On Wed, Jun 03, 2015 at 09:58:25PM +0300, Or Gerlitz wrote:
>> On Wed, Jun 3, 2015 at 7:31 PM, Jason Gunthorpe
>> <jgunthorpe@obsidianresearch.com> wrote:
>> > On Wed, Jun 03, 2015 at 11:57:12AM +0300, Matan Barak wrote:
>> >> That's a general comment regarding the extension mechanism.
>> >
>> > Yes, but it is also a specific comment about patch #4 which adds,
>> > ib_uverbs_ex_create_cq.
>> >
>> > Based on the implementation of create_cq, it is pretty clear that
>> > every driver supports ib_uverbs_ex_create_cq, so patch #4 should just
>> > force the flag in the device register function so it is globally enabled.

>> But the other drivers currently do not support any CQ creation flag
>> and hence no extended functionality, I don't see the point signaling
>> towards user-space that the verb is supported, please elaborate.

> They support the base functionality, the flags = 0 case.

which doesn't let consumers to use any new functionality.

> There is no reason to block access to the base functionality via the
> extended api. That just creates hassles for userspace.
> If userspace detects the extended API is present, it can just
> switch unconditionally all usage to that API.

This is user-space run time story, they don't have the knowledge that
all the LL drivers supports the extended api for CQ creation. We had
to check the flag and in all LL drivers since the in-kernel IB stack
has no (and need not to have any) notion of extended calls.

> This is how most new kernel syscalls are introduced (glibc
> does this transparently).

That's an interesting comment. And you know what, basically we can add
auto support for that call in uverbs.

But the point here is a bit different: I somehow have the feeling that
unless ~each and every one of your review comments are accepted to the
letter, no inclusion.

You are not the maintainer here, and even maintainers prefer not to
force each of their detailed comments on submitters.

This specific comment relates TINY in-kernel thing that can be changed later.

If from ten comments you give me I accept as is five, with the other
five I am trying to argue, on two of them we agree to my side, on two
we go your side and on the last one we let the maintainer to cut, this
is a healthy process that makes sense.

Currently it's feels like of either accepting 98% of the comments you
give or no acceptance.

> Detecting what flags a driver supports (if any) is any entirely
> different and orthogonal issue to introducing comp_mask/etc.

I didn't say that the which flags are supported detection relates to
exposing that extended uverbs call. I don't understand the "is any
entirely different" part of the sentence, is that as of me being
EMS-er?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jason Gunthorpe June 3, 2015, 8:38 p.m. UTC | #9
On Wed, Jun 03, 2015 at 10:35:03PM +0300, Or Gerlitz wrote:

> > They support the base functionality, the flags = 0 case.
> 
> which doesn't let consumers to use any new functionality.

So what? A call with flags = 0 works, why return ENOSYS for all
drivers except mlx4 in that case? It doesn't make sense to be
asymmetric like that.

Again, the extension process (patch #4) was to introduce the flags, as
long as the flags is processed properly then the syscall is
functional and should not return ENOSYS. It does not matter which
flags, if any, are supported.

> But the point here is a bit different: I somehow have the feeling that
> unless ~each and every one of your review comments are accepted to the
> letter, no inclusion.

I am just reviewing, Doug will have to decide if discussion is done or
not. To be clear: 'no inclusion' from me would be a clear NAK
statement.

If I'm going to provide my Reviewed-By I want to see:
 1) Comments addressed via a code change
 2) Comments addressed via a persuasive technical argument
 3) Comments addressed as 'too much work'/'un-important'/'personal preference'/etc.
 4) Comments addressed because I am wrong

And try to be clear about it, explain clearly.

> You are not the maintainer here, and even maintainers prefer not to
> force each of their detailed comments on submitters.

This isn't a detailed comment, this is a significant point about how a
UAPI is expected to work. And yes, UAPI is important, details are
important and I will argue for my viewpoint.

There is a huge difference between doing work on your own drivers and
doing core work. I do not know many cases where a maintainer/reviewer
of core sections will let details slide. There is a high expectation
for core code, and a very high expectation for UAPI.

> This specific comment relates TINY in-kernel thing that can be
> changed later.

Where is the pride? Do it right!

> If from ten comments you give me I accept as is five, with the other
> five I am trying to argue, on two of them we agree to my side, on two
> we go your side and on the last one we let the maintainer to cut, this
> is a healthy process that makes sense.

Sure, but you have to make a persuasive technical argument.. You can't
just argue..

In this case, you completely skipped over my main point:
 Drivers that only support flags == 0 should not return ENOSYS.

I gave several reasons why I think this is important, and how
userspace can use this, and how it is normal in the kernel.

You responded to the reasons, but ignored the actual thesis, and
didn't provide any counter reasons to support your idea:
 Drivers that only support flags == 0 should return ENOSYS.

So we are not debating, we are just arguing, and it isn't productive.

> I don't understand the "is any entirely different" part of the
> sentence, is that as of me being EMS-er?

No, that is just me typoing 'an -> any'. Sorry

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 3992349..832d571 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2323,6 +2323,10 @@  static void *mlx4_ib_add(struct mlx4_dev *dev)
 			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_FLOW);
 	}
 
+	ibdev->ib_dev.uverbs_ex_cmd_mask |=
+		(1ull << IB_USER_VERBS_EX_CMD_QUERY_DEVICE) |
+		(1ull << IB_USER_VERBS_EX_CMD_CREATE_CQ);
+
 	mlx4_ib_alloc_eqs(dev, ibdev);
 
 	spin_lock_init(&iboe->lock);