diff mbox

[v2] pci/switchtec: Don't use completion's wait queue

Message ID 20171005105100.2nf4ih2fgq6dax5g@linutronix.de (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Sebastian Andrzej Siewior Oct. 5, 2017, 10:51 a.m. UTC
The poll callback is using completion's wait_queue_head_t member and
puts it in poll_wait() so the poll() caller gets a wakeup after command
completed. This does not work on RT because we don't have a
wait_queue_head_t in our completion implementation. Nobody in tree does
like that in tree so this is the only driver that breaks.

Instead of using the completion here is waitqueue with a status flag as
suggested by Logan.

I don't have the HW so I have no idea if it works as expected, so please
test it.

Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
On 2017-10-04 10:13:15 [-0600], Logan Gunthorpe wrote:

> But at the very least, if we _absolutely_ have to patch this out, you
> shouldn't make it so convoluted. Just replace the struct completion with a
> wait queue and a done flag (which is what a completion is). Don't move the
> wait queue into switchtec_dev and don't use an unnecessary counter. And
> certainly don't think about merging it with another wait queue that has
> completely different wake events and waiters.

here it is.

v1…v2: use a status flag and a wait_queue_head_t instead as suggested by
       Logan.

 drivers/pci/switch/switchtec.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

Comments

Logan Gunthorpe Oct. 5, 2017, 6:46 p.m. UTC | #1
I still don't want this merged as it makes the code less clear, but I 
tested it and it does not work.

On 05/10/17 04:51 AM, Sebastian Andrzej Siewior wrote:

> @@ -451,7 +452,7 @@ static void mrpc_complete_cmd(struct switchtec_dev *stdev)
>   		      stuser->read_len);
>   
>   out:
> -	complete_all(&stuser->comp);
> +	wake_up_interruptible(&stuser->cmd_comp);

I think you forgot to set cmd_done here. Once I add that, it did start 
working correctly.

>   	list_del_init(&stuser->list);
>   	stuser_put(stuser);
>   	stdev->mrpc_busy = 0;
> @@ -721,10 +722,11 @@ static ssize_t switchtec_dev_read(struct file *filp, char __user *data,
>   	mutex_unlock(&stdev->mrpc_mutex);
>   
>   	if (filp->f_flags & O_NONBLOCK) {
> -		if (!try_wait_for_completion(&stuser->comp))
> +		if (!stuser->cmd_done)
>   			return -EAGAIN;

I'd probably also suggest using READ_ONCE when reading cmd_done.

Logan
Sebastian Andrzej Siewior Nov. 2, 2017, 3:07 p.m. UTC | #2
On 2017-10-05 12:46:34 [-0600], Logan Gunthorpe wrote:
> I still don't want this merged as it makes the code less clear, but I tested
> it and it does not work.
> 
> On 05/10/17 04:51 AM, Sebastian Andrzej Siewior wrote:
> 
> > @@ -451,7 +452,7 @@ static void mrpc_complete_cmd(struct switchtec_dev *stdev)
> >   		      stuser->read_len);
> >   out:
> > -	complete_all(&stuser->comp);
> > +	wake_up_interruptible(&stuser->cmd_comp);
> 
> I think you forgot to set cmd_done here. Once I add that, it did start
> working correctly.

oh, makes sense.

> >   	list_del_init(&stuser->list);
> >   	stuser_put(stuser);
> >   	stdev->mrpc_busy = 0;
> > @@ -721,10 +722,11 @@ static ssize_t switchtec_dev_read(struct file *filp, char __user *data,
> >   	mutex_unlock(&stdev->mrpc_mutex);
> >   	if (filp->f_flags & O_NONBLOCK) {
> > -		if (!try_wait_for_completion(&stuser->comp))
> > +		if (!stuser->cmd_done)
> >   			return -EAGAIN;
> 
> I'd probably also suggest using READ_ONCE when reading cmd_done.

Why so? There is no way the compiler could have cached the value
somehow.

> Logan

Sebastian
Logan Gunthorpe Nov. 2, 2017, 4:03 p.m. UTC | #3
On 11/2/2017 9:07 AM, Sebastian Andrzej Siewior wrote:
> Why so? There is no way the compiler could have cached the value
> somehow.

Well, I think it would be sufficient to notice that the code you were 
replacing used a READ_ONCE and you're putting it into a function that's 
a bit more complicated. But I'd suggest you read [1] which essentially 
says that all shared variables should be accessed with such a macro even 
if only to document that it's a part of a thread synchronization protocol.

Logan


[1] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
Sebastian Andrzej Siewior Nov. 2, 2017, 5:17 p.m. UTC | #4
On 2017-11-02 10:03:53 [-0600], Logan Gunthorpe wrote:
> On 11/2/2017 9:07 AM, Sebastian Andrzej Siewior wrote:
> > Why so? There is no way the compiler could have cached the value
> > somehow.
> 
> Well, I think it would be sufficient to notice that the code you were
> replacing used a READ_ONCE and you're putting it into a function that's a
> bit more complicated. But I'd suggest you read [1] which essentially says
> that all shared variables should be accessed with such a macro even if only
> to document that it's a part of a thread synchronization protocol.

It used READ_ONCE in case the code would have been inlined within a loop
which makes sense to some degree. It is not the case here. But hey I can
add, no problem…

> Logan

Sebastian
Logan Gunthorpe Nov. 2, 2017, 5:23 p.m. UTC | #5
On 11/2/2017 11:17 AM, Sebastian Andrzej Siewior wrote:
>> Well, I think it would be sufficient to notice that the code you were
>> replacing used a READ_ONCE and you're putting it into a function that's a
>> bit more complicated. But I'd suggest you read [1] which essentially says
>> that all shared variables should be accessed with such a macro even if only
>> to document that it's a part of a thread synchronization protocol.
> 
> It used READ_ONCE in case the code would have been inlined within a loop
> which makes sense to some degree. It is not the case here. But hey I can
> add, no problem…

Well, you clearly didn't read the article I sent or even what I said. 
But your reasoning is a bit flawed seeing try_wait_for_completion() is 
not designed to be inlined and will never be inlined unless the kernel 
is compiled with link-time optimization which I don't think many people use.

Logan
Sebastian Andrzej Siewior Nov. 2, 2017, 6:05 p.m. UTC | #6
On 2017-11-02 11:23:16 [-0600], Logan Gunthorpe wrote:
> 
> Well, you clearly didn't read the article I sent or even what I said. But

I did, I just didn't manage to make the link.
The futex example looks like compiler optimisation which should not have
happen and would be ideal for READ_ONCE, yes. But this is not the case
here and this is what I pointed out.
And the ps2 example, well it did not check condition after the wait so
it was not obvious if the condition (for which it was waiting) was true or not.
So if you want this thing to be catched by tools, I don't know how this bug

|	int rc = -1;
|	...
|	wait_event_timeout(ps2dev->wait, !(READ_ONCE(ps2dev->flags) & PS2_FLAG_CMD), timeout);
|	for (i = 0; i < receive; i++)
|		resp[i] = ps2dev->cmdbuf[(receive - 1) - i];
|	if (READ_ONCE(ps2dev->cmdcnt))
|		goto out;
|	rc = 0;
|out:
|	return rc;

could have spotted by anything but a human.

> your reasoning is a bit flawed seeing try_wait_for_completion() is not
> designed to be inlined and will never be inlined unless the kernel is
> compiled with link-time optimization which I don't think many people use.

See 7c34e3180a01 ("sched/completion: Add lock-free checking of the
blocking case")

> Logan

Sebastian
Logan Gunthorpe Nov. 2, 2017, 8:02 p.m. UTC | #7
On 02/11/17 12:05 PM, Sebastian Andrzej Siewior wrote:
> On 2017-11-02 11:23:16 [-0600], Logan Gunthorpe wrote:
>>
>> Well, you clearly didn't read the article I sent or even what I said. But
> 
> I did, I just didn't manage to make the link.
> The futex example looks like compiler optimisation which should not have
> happen and would be ideal for READ_ONCE, yes. But this is not the case
> here and this is what I pointed out.

And you still missed what I said and the entire point of the article.

> See 7c34e3180a01 ("sched/completion: Add lock-free checking of the
> blocking case")

Sounds questionable. I don't see how that function could be inlined, 
even then.

Logan
Sebastian Andrzej Siewior Nov. 2, 2017, 8:53 p.m. UTC | #8
On 2017-11-02 14:02:23 [-0600], Logan Gunthorpe wrote:
> And you still missed what I said and the entire point of the article.

I'm sorry.

> > See 7c34e3180a01 ("sched/completion: Add lock-free checking of the
> > blocking case")
> 
> Sounds questionable. I don't see how that function could be inlined, even
> then.

It would have made sense if that function would be used also within the
file where it was declared but it is not the case.
So you see the questionable way the READ_ONCE got in there and now you
may understand that I did not see much of big loss if it wasn't there
after the rewrite/change.

> Logan

Sebastian
Logan Gunthorpe Nov. 2, 2017, 8:55 p.m. UTC | #9
On 02/11/17 02:53 PM, Sebastian Andrzej Siewior wrote:
> It would have made sense if that function would be used also within the
> file where it was declared but it is not the case.
> So you see the questionable way the READ_ONCE got in there and now you
> may understand that I did not see much of big loss if it wasn't there
> after the rewrite/change.

Yes, but I feel like it still makes sense for it to be there (per the 
article I linked), despite the incorrect justification.

Logan
diff mbox

Patch

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index af81b2dec42e..4c843fca709a 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -306,10 +306,11 @@  struct switchtec_user {
 
 	enum mrpc_state state;
 
-	struct completion comp;
+	wait_queue_head_t cmd_comp;
 	struct kref kref;
 	struct list_head list;
 
+	bool cmd_done;
 	u32 cmd;
 	u32 status;
 	u32 return_code;
@@ -331,7 +332,7 @@  static struct switchtec_user *stuser_create(struct switchtec_dev *stdev)
 	stuser->stdev = stdev;
 	kref_init(&stuser->kref);
 	INIT_LIST_HEAD(&stuser->list);
-	init_completion(&stuser->comp);
+	init_waitqueue_head(&stuser->cmd_comp);
 	stuser->event_cnt = atomic_read(&stdev->event_cnt);
 
 	dev_dbg(&stdev->dev, "%s: %p\n", __func__, stuser);
@@ -414,7 +415,7 @@  static int mrpc_queue_cmd(struct switchtec_user *stuser)
 	kref_get(&stuser->kref);
 	stuser->read_len = sizeof(stuser->data);
 	stuser_set_state(stuser, MRPC_QUEUED);
-	init_completion(&stuser->comp);
+	stuser->cmd_done = false;
 	list_add_tail(&stuser->list, &stdev->mrpc_queue);
 
 	mrpc_cmd_submit(stdev);
@@ -451,7 +452,7 @@  static void mrpc_complete_cmd(struct switchtec_dev *stdev)
 		      stuser->read_len);
 
 out:
-	complete_all(&stuser->comp);
+	wake_up_interruptible(&stuser->cmd_comp);
 	list_del_init(&stuser->list);
 	stuser_put(stuser);
 	stdev->mrpc_busy = 0;
@@ -721,10 +722,11 @@  static ssize_t switchtec_dev_read(struct file *filp, char __user *data,
 	mutex_unlock(&stdev->mrpc_mutex);
 
 	if (filp->f_flags & O_NONBLOCK) {
-		if (!try_wait_for_completion(&stuser->comp))
+		if (!stuser->cmd_done)
 			return -EAGAIN;
 	} else {
-		rc = wait_for_completion_interruptible(&stuser->comp);
+		rc = wait_event_interruptible(stuser->cmd_comp,
+					      stuser->cmd_done);
 		if (rc < 0)
 			return rc;
 	}
@@ -772,7 +774,7 @@  static unsigned int switchtec_dev_poll(struct file *filp, poll_table *wait)
 	struct switchtec_dev *stdev = stuser->stdev;
 	int ret = 0;
 
-	poll_wait(filp, &stuser->comp.wait, wait);
+	poll_wait(filp, &stuser->cmd_comp, wait);
 	poll_wait(filp, &stdev->event_wq, wait);
 
 	if (lock_mutex_and_test_alive(stdev))
@@ -780,7 +782,7 @@  static unsigned int switchtec_dev_poll(struct file *filp, poll_table *wait)
 
 	mutex_unlock(&stdev->mrpc_mutex);
 
-	if (try_wait_for_completion(&stuser->comp))
+	if (stuser->cmd_done)
 		ret |= POLLIN | POLLRDNORM;
 
 	if (stuser->event_cnt != atomic_read(&stdev->event_cnt))
@@ -1255,7 +1257,8 @@  static void stdev_kill(struct switchtec_dev *stdev)
 
 	/* Wake up and kill any users waiting on an MRPC request */
 	list_for_each_entry_safe(stuser, tmpuser, &stdev->mrpc_queue, list) {
-		complete_all(&stuser->comp);
+		stuser->cmd_done = true;
+		wake_up_interruptible(&stuser->cmd_comp);
 		list_del_init(&stuser->list);
 		stuser_put(stuser);
 	}