[2/2] virtio-rng: fix stuck in catting hwrng attributes

Message ID 1410340027-15373-3-git-send-email-akong@redhat.com (mailing list archive)
State New, archived
Commit Message

Amos Kong Sept. 10, 2014, 9:07 a.m. UTC
When I check the hwrng attributes in sysfs, the cat process always gets
stuck if the guest has only 1 vcpu and uses a slow rng backend.

Currently we use need_resched() in rng_dev_read() to check whether any
tasks are waiting to run on the current cpu. But need_resched()
doesn't work because rng_dev_read() is executing in user context.

This patch removes need_resched() and increases the delay to 10 jiffies,
so that other tasks have a chance to execute the protected code.
Delaying 1 jiffy also works, but 10 jiffies is safer.

Signed-off-by: Amos Kong <akong@redhat.com>
---
 drivers/char/hw_random/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
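
For scale, the wall-clock length of the proposed delay depends on the
kernel's tick rate (HZ). The sketch below is a userspace illustration only,
not kernel code: jiffies_to_ms_sketch() and print_delay_table() are
hypothetical helpers (the kernel has its own jiffies_to_msecs()), and the
HZ values listed are common configuration choices assumed for the example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper: convert a delay in jiffies to
 * milliseconds for a given tick rate (hz ticks per second).
 */
unsigned long jiffies_to_ms_sketch(unsigned long jiffies, unsigned long hz)
{
	assert(hz > 0);
	return jiffies * 1000UL / hz;
}

/* Print how long the old (1 jiffy) and new (10 jiffies) delays last. */
void print_delay_table(void)
{
	/* Assumed example tick rates, not taken from the patch. */
	static const unsigned long hz_values[] = { 100, 250, 1000 };
	size_t i;

	for (i = 0; i < sizeof(hz_values) / sizeof(hz_values[0]); i++)
		printf("HZ=%lu: 1 jiffy = %lu ms, 10 jiffies = %lu ms\n",
		       hz_values[i],
		       jiffies_to_ms_sketch(1, hz_values[i]),
		       jiffies_to_ms_sketch(10, hz_values[i]));
}
```

Under these assumptions the 10-jiffy sleep ranges from 10 ms at HZ=1000 to
100 ms at HZ=100, and it is now taken on every pass through the loop.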

Comments

Amit Shah Sept. 11, 2014, 6:08 a.m. UTC | #1
On (Wed) 10 Sep 2014 [17:07:07], Amos Kong wrote:
> When I check hwrng attributes in sysfs, cat process always gets
> stuck if guest has only 1 vcpu and uses a slow rng backend.
> 
> Currently we check if there is any tasks waiting to be run on
> current cpu in rng_dev_read() by need_resched(). But need_resched()
> doesn't work because rng_dev_read() is executing in user context.
> 
> This patch removed need_resched() and increase delay to 10 jiffies,
> then other tasks can have chance to execute protected code.
> Delaying 1 jiffy also works, but 10 jiffies is safer.

I'd prefer two patches for this one: one to remove the need_resched()
check, and the other to increase the timeout.

Anyway,

Reviewed-by: Amit Shah <amit.shah@redhat.com>

> 
> Signed-off-by: Amos Kong <akong@redhat.com>
> ---
>  drivers/char/hw_random/core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index c591d7e..b5d1b6f 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  
>  		mutex_unlock(&rng_mutex);
>  
> -		if (need_resched())
> -			schedule_timeout_interruptible(1);
> +		schedule_timeout_interruptible(10);
>  
>  		if (signal_pending(current)) {
>  			err = -ERESTARTSYS;
> -- 
> 1.9.3
> 

		Amit
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rusty Russell Sept. 11, 2014, 11:38 a.m. UTC | #2
Amos Kong <akong@redhat.com> writes:
> When I check hwrng attributes in sysfs, cat process always gets
> stuck if guest has only 1 vcpu and uses a slow rng backend.
>
> Currently we check if there is any tasks waiting to be run on
> current cpu in rng_dev_read() by need_resched(). But need_resched()
> doesn't work because rng_dev_read() is executing in user context.

I don't understand this explanation?  I'd expect the sysfs process to be
woken by the mutex_unlock().

If we're really high priority (vs. the sysfs process) then I can see why
we'd need schedule_timeout_interruptible() instead of just schedule(),
and in that case, need_resched() would be false too.

You could argue that's intended behaviour, but I can't see how it
happens in the normal case anyway.

What am I missing?

Thanks,
Rusty.

> This patch removed need_resched() and increase delay to 10 jiffies,
> then other tasks can have chance to execute protected code.
> Delaying 1 jiffy also works, but 10 jiffies is safer.
>
> Signed-off-by: Amos Kong <akong@redhat.com>
> ---
>  drivers/char/hw_random/core.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index c591d7e..b5d1b6f 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
>  
>  		mutex_unlock(&rng_mutex);
>  
> -		if (need_resched())
> -			schedule_timeout_interruptible(1);
> +		schedule_timeout_interruptible(10);
>  
>  		if (signal_pending(current)) {
>  			err = -ERESTARTSYS;
> -- 
> 1.9.3
Amos Kong Sept. 13, 2014, 5:12 p.m. UTC | #3
On Thu, Sep 11, 2014 at 09:08:03PM +0930, Rusty Russell wrote:
> Amos Kong <akong@redhat.com> writes:
> > When I check hwrng attributes in sysfs, cat process always gets
> > stuck if guest has only 1 vcpu and uses a slow rng backend.
> >
> > Currently we check if there is any tasks waiting to be run on
> > current cpu in rng_dev_read() by need_resched(). But need_resched()
> > doesn't work because rng_dev_read() is executing in user context.
> 
> I don't understand this explanation?  I'd expect the sysfs process to be
> woken by the mutex_unlock().

But actually sysfs process's not woken always, this is they the
process gets stuck.
 
> If we're really high priority (vs. the sysfs process) then I can see why
> we'd need schedule_timeout_interruptible() instead of just schedule(),
> and in that case, need_resched() would be false too.
> 
> You could argue that's intended behaviour, but I can't see how it
> happens in the normal case anyway.
> 
> What am I missing?
> 
> Thanks,
> Rusty.
> 
> > This patch removed need_resched() and increase delay to 10 jiffies,
> > then other tasks can have chance to execute protected code.
> > Delaying 1 jiffy also works, but 10 jiffies is safer.
> >
> > Signed-off-by: Amos Kong <akong@redhat.com>
> > ---
> >  drivers/char/hw_random/core.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > index c591d7e..b5d1b6f 100644
> > --- a/drivers/char/hw_random/core.c
> > +++ b/drivers/char/hw_random/core.c
> > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> >  
> >  		mutex_unlock(&rng_mutex);
> >  
> > -		if (need_resched())
> > -			schedule_timeout_interruptible(1);
> > +		schedule_timeout_interruptible(10);
> >  
> >  		if (signal_pending(current)) {
> >  			err = -ERESTARTSYS;
> > -- 
> > 1.9.3
Amos Kong Sept. 14, 2014, 1:12 a.m. UTC | #4
On Sun, Sep 14, 2014 at 01:12:58AM +0800, Amos Kong wrote:
> On Thu, Sep 11, 2014 at 09:08:03PM +0930, Rusty Russell wrote:
> > Amos Kong <akong@redhat.com> writes:
> > > When I check hwrng attributes in sysfs, cat process always gets
> > > stuck if guest has only 1 vcpu and uses a slow rng backend.
> > >
> > > Currently we check if there is any tasks waiting to be run on
> > > current cpu in rng_dev_read() by need_resched(). But need_resched()
> > > doesn't work because rng_dev_read() is executing in user context.
> > 
> > I don't understand this explanation?  I'd expect the sysfs process to be
> > woken by the mutex_unlock().
> 
> But actually sysfs process's not woken always, this is they the
> process gets stuck.

%s/they/why/

Hi Rusty,


Reference:
http://www.linuxgrill.com/anonymous/fire/netfilter/kernel-hacking-HOWTO-2.html

The read() syscall on /dev/hwrng enters the kernel, where the read operation
is rng_dev_read(). It runs in userspace context (not interrupt context).

Userspace context doesn't allow other user contexts to run on that CPU,
unless the kernel code sleeps for some reason.


In this case, need_resched() doesn't work.

My solution is to remove need_resched() and use an appropriate delay via
schedule_timeout_interruptible(10).

Thanks, Amos
  
> > If we're really high priority (vs. the sysfs process) then I can see why
> > we'd need schedule_timeout_interruptible() instead of just schedule(),
> > and in that case, need_resched() would be false too.
> > 
> > You could argue that's intended behaviour, but I can't see how it
> > happens in the normal case anyway.
> > 
> > What am I missing?

> > Thanks,
> > Rusty.
> > 
> > > This patch removed need_resched() and increase delay to 10 jiffies,
> > > then other tasks can have chance to execute protected code.
> > > Delaying 1 jiffy also works, but 10 jiffies is safer.
> > >
> > > Signed-off-by: Amos Kong <akong@redhat.com>
> > > ---
> > >  drivers/char/hw_random/core.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > index c591d7e..b5d1b6f 100644
> > > --- a/drivers/char/hw_random/core.c
> > > +++ b/drivers/char/hw_random/core.c
> > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > >  
> > >  		mutex_unlock(&rng_mutex);
> > >  
> > > -		if (need_resched())
> > > -			schedule_timeout_interruptible(1);
> > > +		schedule_timeout_interruptible(10);
> > >  
> > >  		if (signal_pending(current)) {
> > >  			err = -ERESTARTSYS;
> > > -- 
Amos Kong Sept. 14, 2014, 1:16 a.m. UTC | #5
On Thu, Sep 11, 2014 at 11:38:38AM +0530, Amit Shah wrote:
> On (Wed) 10 Sep 2014 [17:07:07], Amos Kong wrote:
> > When I check hwrng attributes in sysfs, cat process always gets
> > stuck if guest has only 1 vcpu and uses a slow rng backend.
> > 
> > Currently we check if there is any tasks waiting to be run on
> > current cpu in rng_dev_read() by need_resched(). But need_resched()
> > doesn't work because rng_dev_read() is executing in user context.
> > 
> > This patch removed need_resched() and increase delay to 10 jiffies,
> > then other tasks can have chance to execute protected code.
> > Delaying 1 jiffy also works, but 10 jiffies is safer.

Hi Amit,
 
> I'd prefer two patches for this one: one to remove the need_resched()
> check, and the other to increase the timeout.

If Rusty agrees with this fix, I will respin to update the commit log
with a clear description and split the patches into 3.

Thanks for the review.
 
> Anyway,
> 
> Reviewed-by: Amit Shah <amit.shah@redhat.com>

--
                Amos.
Amos Kong Sept. 14, 2014, 2:25 a.m. UTC | #6
On Sun, Sep 14, 2014 at 09:12:08AM +0800, Amos Kong wrote:

...
> > > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > > index c591d7e..b5d1b6f 100644
> > > > --- a/drivers/char/hw_random/core.c
> > > > +++ b/drivers/char/hw_random/core.c
> > > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > > >  
> > > >  		mutex_unlock(&rng_mutex);
> > > >  
> > > > -		if (need_resched())
> > > > -			schedule_timeout_interruptible(1);
> > > > +		schedule_timeout_interruptible(10);

The problem only occurred in a non-SMP guest, so we can improve it to:

                        if (!is_smp())
                                schedule_timeout_interruptible(10);

is_smp() is only available on the arm arch; we need a general one.

> > > >  
> > > >  		if (signal_pending(current)) {
> > > >  			err = -ERESTARTSYS;
> > > > --
Radim Krčmář Sept. 15, 2014, 4:48 p.m. UTC | #7
2014-09-14 10:25+0800, Amos Kong:
> On Sun, Sep 14, 2014 at 09:12:08AM +0800, Amos Kong wrote:
> 
> ...
> > > > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > > > index c591d7e..b5d1b6f 100644
> > > > > --- a/drivers/char/hw_random/core.c
> > > > > +++ b/drivers/char/hw_random/core.c
> > > > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > > > >  
> > > > >  		mutex_unlock(&rng_mutex);
> > > > >  
> > > > > -		if (need_resched())
> > > > > -			schedule_timeout_interruptible(1);
> > > > > +		schedule_timeout_interruptible(10);

If cond_resched() does not work, it is a bug elsewhere.

> Problem only occurred in non-smp guest, we can improve it to:
> 
>                         if(!is_smp())
>                                 schedule_timeout_interruptible(10);
> 
> is_smp() is only available for arm arch, we need a general one.

(It is num_online_cpus() > 1.)
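
num_online_cpus() is a kernel-internal helper, so the condition above cannot
be run outside the kernel. As a rough userspace analogue only, the sketch
below uses sysconf(_SC_NPROCESSORS_ONLN), which is widely available on Linux
and the BSDs, to make the same "more than one online CPU" test.
online_cpus_sketch() and is_smp_sketch() are hypothetical names, not kernel
APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Number of CPUs currently online, falling back to 1 if sysconf() fails. */
long online_cpus_sketch(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return n < 1 ? 1 : n;
}

/* Mirrors the kernel expression num_online_cpus() > 1. */
bool is_smp_sketch(long cpus)
{
	assert(cpus >= 1);
	return cpus > 1;
}
```

Calling is_smp_sketch(online_cpus_sketch()) then plays the role that the
arm-only is_smp() had in the proposal quoted above.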
Amos Kong Sept. 16, 2014, 12:50 a.m. UTC | #8
CC linux-kernel

Original thread: http://comments.gmane.org/gmane.linux.kernel.virtualization/22775

On Mon, Sep 15, 2014 at 06:48:46PM +0200, Radim Krčmář wrote:
> 2014-09-14 10:25+0800, Amos Kong:
> > On Sun, Sep 14, 2014 at 09:12:08AM +0800, Amos Kong wrote:
> > 
> > ...
> > > > > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > > > > index c591d7e..b5d1b6f 100644
> > > > > > --- a/drivers/char/hw_random/core.c
> > > > > > +++ b/drivers/char/hw_random/core.c
> > > > > > @@ -195,8 +195,7 @@ static ssize_t rng_dev_read(struct file *filp, char __user *buf,
> > > > > >  
> > > > > >  		mutex_unlock(&rng_mutex);
> > > > > >  
> > > > > > -		if (need_resched())
> > > > > > -			schedule_timeout_interruptible(1);
> > > > > > +		schedule_timeout_interruptible(10);
> 
> If cond_resched() does not work, it is a bug elsewhere.

Thanks for your reply. Jason also told me that TIF_NEED_RESCHED should
be set in this case, so that need_resched() returns true.

I will investigate the issue and reply to you later.
 
> > Problem only occurred in non-smp guest, we can improve it to:
> > 
> >                         if(!is_smp())
> >                                 schedule_timeout_interruptible(10);
> > 
> > is_smp() is only available for arm arch, we need a general one.
> 
> (It is num_online_cpus() > 1.)
Rusty Russell Sept. 16, 2014, 3:35 p.m. UTC | #9
Amos Kong <akong@redhat.com> writes:
> On Sun, Sep 14, 2014 at 01:12:58AM +0800, Amos Kong wrote:
>> On Thu, Sep 11, 2014 at 09:08:03PM +0930, Rusty Russell wrote:
>> > Amos Kong <akong@redhat.com> writes:
>> > > When I check hwrng attributes in sysfs, cat process always gets
>> > > stuck if guest has only 1 vcpu and uses a slow rng backend.
>> > >
>> > > Currently we check if there is any tasks waiting to be run on
>> > > current cpu in rng_dev_read() by need_resched(). But need_resched()
>> > > doesn't work because rng_dev_read() is executing in user context.
>> > 
>> > I don't understand this explanation?  I'd expect the sysfs process to be
>> > woken by the mutex_unlock().
>> 
>> But actually sysfs process's not woken always, this is they the
>> process gets stuck.
>
> %s/they/why/
>
> Hi Rusty,
>
>
> Reference:
> http://www.linuxgrill.com/anonymous/fire/netfilter/kernel-hacking-HOWTO-2.html

Sure, that was true when I wrote it, and is still true when preempt is
off.

> read() syscall of /dev/hwrng will enter into kernel, the read operation is
> rng_dev_read(), it's userspace context (not interrupt context).
>
> Userspace context doesn't allow other user contexts run on that CPU,
> unless the kernel code sleeps for some reason.

This is true assuming preempt is off, yes.

> In this case, the need_resched() doesn't work.

This is exactly what need_resched() is for: it should return true if
there's another process of sufficient priority waiting to be run.  It
implies that schedule() would run it.

git blame doesn't offer any enlightenment here, as to why we use
schedule_timeout_interruptible() at all.

I would expect mutex_unlock() to wake the other reader.  The code
certainly seems to, so it should now be runnable and need_resched()
should return true.

I suspect something else is happening which makes this "work".

Cheers,
Rusty.
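
The loop shape under discussion (hold the lock for a slow read, drop it,
sleep, retake it) can be sketched in userspace with pthreads. This is an
analogy under stated assumptions, not a reproduction: pthread mutexes and
the userspace scheduler differ from the kernel's mutex and
schedule_timeout_interruptible(), so it neither demonstrates the reported
hang nor settles why the delay helps. All names (rng_lock, waiter,
run_reader_loop) and the 5 ms / 10 ms durations are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t rng_lock = PTHREAD_MUTEX_INITIALIZER;
static bool waiter_ran;	/* protected by rng_lock */

static void sleep_ms(long ms)
{
	struct timespec ts = { .tv_sec = ms / 1000,
			       .tv_nsec = (ms % 1000) * 1000000L };

	nanosleep(&ts, NULL);
}

/* Stands in for the sysfs reader: it needs the lock once. */
static void *waiter(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&rng_lock);
	waiter_ran = true;
	pthread_mutex_unlock(&rng_lock);
	return NULL;
}

/*
 * Stands in for the read loop: hold the lock during a slow "read",
 * drop it, then sleep unconditionally before retaking it.
 * Returns the iteration at which the waiter was first observed to have
 * run, or max_iters if it only ran after the loop had finished.
 */
int run_reader_loop(int max_iters)
{
	pthread_t t;
	int i, rc, seen_at = max_iters;

	assert(max_iters > 0);
	waiter_ran = false;

	/* Take the lock first so the waiter has to block on it. */
	pthread_mutex_lock(&rng_lock);
	rc = pthread_create(&t, NULL, waiter, NULL);
	assert(rc == 0);

	for (i = 0; i < max_iters; i++) {
		sleep_ms(5);			/* slow backend, lock held */
		pthread_mutex_unlock(&rng_lock);
		sleep_ms(10);			/* unconditional delay */
		pthread_mutex_lock(&rng_lock);
		if (waiter_ran && seen_at == max_iters)
			seen_at = i;
	}
	pthread_mutex_unlock(&rng_lock);

	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return seen_at;
}
```

A return value of 0 from run_reader_loop(3) would mean the waiter got the
lock during the first sleep; build with -pthread where the toolchain
requires it.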
Patch

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index c591d7e..b5d1b6f 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -195,8 +195,7 @@  static ssize_t rng_dev_read(struct file *filp, char __user *buf,
 
 		mutex_unlock(&rng_mutex);
 
-		if (need_resched())
-			schedule_timeout_interruptible(1);
+		schedule_timeout_interruptible(10);
 
 		if (signal_pending(current)) {
 			err = -ERESTARTSYS;