Input: mousedev - add a schedule point in mousedev_write()

Message ID	20181004154749.111595-1-edumazet@google.com (mailing list archive)
State	Accepted
Headers
Return-Path: <linux-input-owner@kernel.org>
From: Eric Dumazet <edumazet@google.com>
To: linux-kernel <linux-kernel@vger.kernel.org>
Cc: Eric Dumazet <edumazet@google.com>, Eric Dumazet <eric.dumazet@gmail.com>, Dmitry Torokhov <dmitry.torokhov@gmail.com>, linux-input@vger.kernel.org
Subject: [PATCH] Input: mousedev - add a schedule point in mousedev_write()
Date: Thu, 4 Oct 2018 08:47:49 -0700
Message-Id: <20181004154749.111595-1-edumazet@google.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-input-owner@vger.kernel.org
Precedence: bulk

Eric Dumazet Oct. 4, 2018, 3:47 p.m. UTC

syzbot was able to trigger rcu stalls by calling write()
with large number of bytes.

Add a cond_resched() in the loop to avoid this.

Link: https://lkml.org/lkml/2018/8/23/1106
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: linux-input@vger.kernel.org
---
 drivers/input/mousedev.c | 1 +
 1 file changed, 1 insertion(+)

Dmitry Torokhov Oct. 4, 2018, 6:59 p.m. UTC | #1

Hi Eric,

On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> syzbot was able to trigger rcu stalls by calling write()
> with large number of bytes.
> 
> Add a cond_resched() in the loop to avoid this.

I think this simply masks a deeper issue. The code fetches characters
from userspace in a loop, takes a lock, quickly places response in an
output buffer, and releases interrupt. I do not see why this should
cause stalls as we do not hold spinlock/interrupts off for extended
period of time.

Adding Paul so he can straighten me out...

> 
> Link: https://lkml.org/lkml/2018/8/23/1106
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> Cc: linux-input@vger.kernel.org
> ---
>  drivers/input/mousedev.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> --- a/drivers/input/mousedev.c
> +++ b/drivers/input/mousedev.c
> @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
>  		mousedev_generate_response(client, c);
>  
>  		spin_unlock_irq(&client->packet_lock);
> +		cond_resched();
>  	}
>  
>  	kill_fasync(&client->fasync, SIGIO, POLL_IN);
> -- 
> 2.19.0.605.g01d371f741-goog
> 

Thanks.

Eric Dumazet Oct. 4, 2018, 7:28 p.m. UTC | #2

On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
>
> Hi Eric,
>
> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > syzbot was able to trigger rcu stalls by calling write()
> > with large number of bytes.
> >
> > Add a cond_resched() in the loop to avoid this.
>
> I think this simply masks a deeper issue. The code fetches characters
> from userspace in a loop, takes a lock, quickly places response in an
> output buffer, and releases interrupt. I do not see why this should
> cause stalls as we do not hold spinlock/interrupts off for extended
> period of time.
>
> Adding Paul so he can straighten me out...
>

Well...

write(fd, buffer, 0x7FFF0000);

Takes between 20 seconds and 2 minutes depending on CONFIG options ....

So either apply my patch, or add a limit on the max count, and
possibly break legitimate user space ?

I dunno...

> >
> > Link: https://lkml.org/lkml/2018/8/23/1106
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
> > Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> > Cc: linux-input@vger.kernel.org
> > ---
> >  drivers/input/mousedev.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> > --- a/drivers/input/mousedev.c
> > +++ b/drivers/input/mousedev.c
> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> >               mousedev_generate_response(client, c);
> >
> >               spin_unlock_irq(&client->packet_lock);
> > +             cond_resched();
> >       }
> >
> >       kill_fasync(&client->fasync, SIGIO, POLL_IN);
> > --
> > 2.19.0.605.g01d371f741-goog
> >
>
> Thanks.
>
> --
> Dmitry
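
For scale, the numbers quoted in this message can be turned into a per-byte cost. The sketch below assumes the loop handles exactly one byte per iteration and that the quoted 20 second and 2 minute figures cover the whole call; the names WRITE_COUNT and ns_per_byte are illustrative only and do not come from the kernel.

```c
/* Bytes requested by the write() call quoted above. */
#define WRITE_COUNT 0x7FFF0000UL

/*
 * Average time per loop iteration in nanoseconds, assuming the whole
 * write() takes total_seconds and processes one byte per iteration.
 */
static double ns_per_byte(double total_seconds)
{
	return total_seconds * 1e9 / (double)WRITE_COUNT;
}
```

Under those assumptions, 0x7FFF0000 is a little over 2.1 billion bytes, so ns_per_byte(20.0) comes to roughly 9 ns and ns_per_byte(120.0) to roughly 56 ns. Each iteration is cheap; it is the iteration count that keeps the CPU busy for so long.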

Paul E. McKenney Oct. 4, 2018, 7:34 p.m. UTC | #3

On Thu, Oct 04, 2018 at 11:59:49AM -0700, Dmitry Torokhov wrote:
> Hi Eric,
> 
> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > syzbot was able to trigger rcu stalls by calling write()
> > with large number of bytes.
> > 
> > Add a cond_resched() in the loop to avoid this.
> 
> I think this simply masks a deeper issue. The code fetches characters
> from userspace in a loop, takes a lock, quickly places response in an
> output buffer, and releases interrupt. I do not see why this should
> cause stalls as we do not hold spinlock/interrupts off for extended
> period of time.
> 
> Adding Paul so he can straighten me out...

If you are running a !PREEMPT kernel, then you need the cond_resched()
to allow the scheduler to choose someone else to run if needed and
to let RCU know that grace periods can end.  Without the cond_resched(),
if you stay in that loop long enough you will get excessive scheduling
latencies and eventually even RCU CPU stall warning splats.

In a PREEMPT (instead of !PREEMPT) kernel, you would be right.  When
preemption is enabled, the scheduler can preempt and RCU can sense
lack of readers from the scheduling-clock interrupt handler.  Which
is why cond_resched() is nothingness in a PREEMPT kernel.

But because people run !PREEMPT as well as PREEMPT kernels, if that loop
can run for a long time, you need that cond_resched().

							Thanx, Paul

> > Link: https://lkml.org/lkml/2018/8/23/1106
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
> > Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> > Cc: linux-input@vger.kernel.org
> > ---
> >  drivers/input/mousedev.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> > --- a/drivers/input/mousedev.c
> > +++ b/drivers/input/mousedev.c
> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> >  		mousedev_generate_response(client, c);
> >  
> >  		spin_unlock_irq(&client->packet_lock);
> > +		cond_resched();
> >  	}
> >  
> >  	kill_fasync(&client->fasync, SIGIO, POLL_IN);
> > -- 
> > 2.19.0.605.g01d371f741-goog
> > 
> 
> Thanks.
> 
> -- 
> Dmitry
>
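
The loop shape under discussion can be modelled in userspace. This is only a rough sketch, not the kernel code: struct client_model, cond_resched_model() and write_model() are hypothetical names, a plain flag stands in for client->packet_lock, and sched_yield() stands in for cond_resched(). Unlike this model, the real cond_resched() only reschedules when the scheduler has asked for it.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-client state; not the kernel struct. */
struct client_model {
	int           lock_held;  /* models client->packet_lock being held */
	unsigned long processed;  /* bytes handled so far */
	unsigned long yields;     /* scheduling points offered so far */
};

/*
 * Userspace approximation of a scheduling point. It must only be
 * reached with the lock dropped, which the assert checks.
 */
static void cond_resched_model(struct client_model *c)
{
	assert(!c->lock_held);
	c->yields++;
	sched_yield();
}

/* One byte per iteration: take the lock, do a little work, drop it, yield. */
static size_t write_model(struct client_model *c,
			  const unsigned char *buf, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		unsigned char byte = buf[i];	/* the kernel fetches this from userspace */

		c->lock_held = 1;		/* models spin_lock_irq() */
		(void)byte;			/* response generation would go here */
		c->processed++;
		c->lock_held = 0;		/* models spin_unlock_irq() */

		cond_resched_model(c);		/* scheduling point, lock already dropped */
	}
	return count;
}
```

The point of the placement is that the scheduling point sits after the unlock, so each iteration gives the scheduler a chance to run something else without ever yielding while the lock is held.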

Paul E. McKenney Oct. 4, 2018, 7:36 p.m. UTC | #4

On Thu, Oct 04, 2018 at 12:28:56PM -0700, Eric Dumazet wrote:
> On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > > syzbot was able to trigger rcu stalls by calling write()
> > > with large number of bytes.
> > >
> > > Add a cond_resched() in the loop to avoid this.
> >
> > I think this simply masks a deeper issue. The code fetches characters
> > from userspace in a loop, takes a lock, quickly places response in an
> > output buffer, and releases interrupt. I do not see why this should
> > cause stalls as we do not hold spinlock/interrupts off for extended
> > period of time.
> >
> > Adding Paul so he can straighten me out...
> >
> 
> Well...
> 
> write(fd, buffer, 0x7FFF0000);
> 
> Takes between 20 seconds and 2 minutes depending on CONFIG options ....

And two minutes would get you an RCU CPU stall warning, even on distro
kernels that set the stall-warning time to a full minute (as opposed
to 21 seconds in mainline).

> So either apply my patch, or add a limit on the max count, and
> possibly break legitimate user space ?
> 
> I dunno...

I vote for Eric's patch.  In fact:

Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>

> > > Link: https://lkml.org/lkml/2018/8/23/1106
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
> > > Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> > > Cc: linux-input@vger.kernel.org
> > > ---
> > >  drivers/input/mousedev.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> > > index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
> > > --- a/drivers/input/mousedev.c
> > > +++ b/drivers/input/mousedev.c
> > > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
> > >               mousedev_generate_response(client, c);
> > >
> > >               spin_unlock_irq(&client->packet_lock);
> > > +             cond_resched();
> > >       }
> > >
> > >       kill_fasync(&client->fasync, SIGIO, POLL_IN);
> > > --
> > > 2.19.0.605.g01d371f741-goog
> > >
> >
> > Thanks.
> >
> > --
> > Dmitry
>
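
The timeouts mentioned in this message can be compared against the durations Eric quoted. The sketch below is a deliberate simplification: it treats the time spent in the loop as the time without a quiescent state, whereas the real detector tracks grace-period age. The macro and function names are illustrative, and the 21 and 60 second values are the ones quoted in the message.

```c
#include <stdbool.h>

/* Stall-warning timeouts as quoted in the message above, in seconds. */
#define STALL_TIMEOUT_MAINLINE_S 21
#define STALL_TIMEOUT_DISTRO_S   60

/*
 * Simplified check: a loop that runs without a scheduling point for at
 * least timeout_seconds is assumed to be long enough to trigger a
 * stall warning.
 */
static bool may_trigger_stall_warning(unsigned int loop_seconds,
				      unsigned int timeout_seconds)
{
	return loop_seconds >= timeout_seconds;
}
```

With these values, a 20 second write stays just under the mainline timeout, while a 2 minute write exceeds both the mainline and the distro setting.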

Dmitry Torokhov Oct. 4, 2018, 7:38 p.m. UTC | #5

On October 4, 2018 12:28:56 PM PDT, Eric Dumazet <edumazet@google.com> wrote:
>On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
><dmitry.torokhov@gmail.com> wrote:
>>
>> Hi Eric,
>>
>> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
>> > syzbot was able to trigger rcu stalls by calling write()
>> > with large number of bytes.
>> >
>> > Add a cond_resched() in the loop to avoid this.
>>
>> I think this simply masks a deeper issue. The code fetches characters
>> from userspace in a loop, takes a lock, quickly places response in an
>> output buffer, and releases interrupt. I do not see why this should
>> cause stalls as we do not hold spinlock/interrupts off for extended
>> period of time.
>>
>> Adding Paul so he can straighten me out...
>>
>
>Well...
>
>write(fd, buffer, 0x7FFF0000);
>
>Takes between 20 seconds and 2 minutes depending on CONFIG options ....

That's fine even if it takes a couple of years. We are not holding spinlock for the entirety of this time, so we should get bumped off CPU at some point.

>
>So either apply my patch, or add a limit on the max count, and
>possibly break legitimate user space ?

Legitimate users write a single character at a time and read response, so exiting after, let's say, 32 bytes would be fine. But I still want to understand why we have to do that.

>
>I dunno...
>
>> >
>> > Link: https://lkml.org/lkml/2018/8/23/1106
>> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>> > Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
>> > Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
>> > Cc: linux-input@vger.kernel.org
>> > ---
>> >  drivers/input/mousedev.c | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
>> > index
>e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347
>100644
>> > --- a/drivers/input/mousedev.c
>> > +++ b/drivers/input/mousedev.c
>> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file
>*file, const char __user *buffer,
>> >               mousedev_generate_response(client, c);
>> >
>> >               spin_unlock_irq(&client->packet_lock);
>> > +             cond_resched();
>> >       }
>> >
>> >       kill_fasync(&client->fasync, SIGIO, POLL_IN);
>> > --
>> > 2.19.0.605.g01d371f741-goog
>> >
>>
>> Thanks.
>>
>> --
>> Dmitry


Thanks.
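
The alternative raised here, capping how much a single write() consumes, could look roughly like the sketch below. It is not part of the posted patch: MAX_WRITE_MODEL takes its value from the "let's say, 32 bytes" remark, and clamp_write_count() and writes_needed() are hypothetical helpers written as plain userspace C.

```c
#include <stddef.h>

/* Hypothetical per-call cap, taken from the "32 bytes" remark above. */
#define MAX_WRITE_MODEL 32

/* Number of bytes a single capped write() would accept. */
static size_t clamp_write_count(size_t count)
{
	return count < MAX_WRITE_MODEL ? count : MAX_WRITE_MODEL;
}

/* Number of capped write() calls a caller needs to push total bytes. */
static size_t writes_needed(size_t total)
{
	return (total + MAX_WRITE_MODEL - 1) / MAX_WRITE_MODEL;
}
```

A short write like this is something callers of write() already have to handle, and a caller writing one character at a time would not notice the cap. The 0x7FFF0000 byte write from the report would need about 67 million calls, each returning to userspace, which is itself a natural scheduling point.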

Eric Dumazet Oct. 4, 2018, 7:45 p.m. UTC | #6

On Thu, Oct 4, 2018 at 12:38 PM Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
>
> On October 4, 2018 12:28:56 PM PDT, Eric Dumazet <edumazet@google.com> wrote:
> >On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
> ><dmitry.torokhov@gmail.com> wrote:
> >>
> >> Hi Eric,
> >>
> >> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> >> > syzbot was able to trigger rcu stalls by calling write()
> >> > with large number of bytes.
> >> >
> >> > Add a cond_resched() in the loop to avoid this.
> >>
> >> I think this simply masks a deeper issue. The code fetches characters
> >> from userspace in a loop, takes a lock, quickly places response in an
> >> output buffer, and releases interrupt. I do not see why this should
> >> cause stalls as we do not hold spinlock/interrupts off for extended
> >> period of time.
> >>
> >> Adding Paul so he can straighten me out...
> >>
> >
> >Well...
> >
> >write(fd, buffer, 0x7FFF0000);
> >
> >Takes between 20 seconds and 2 minutes depending on CONFIG options ....
>
> That's fine even if it takes a couple of years. We are not holding spinlock for the entirety of this time, so we should get bumped off CPU at some point.

Well, you are saying that we could get rid of all cond_resched() calls
in the kernel.

You should send patches asap ;)

>
> >
> >So either apply my patch, or add a limit on the max count, and
> >possibly break legitimate user space ?
>
> Legitimate users write a single character at a time and read response, so exiting after, let's say, 32 bytes would be fine. But I still want to understand why we have to do that.
>
> >
> >I dunno...
> >
> >> >
> >> > Link: https://lkml.org/lkml/2018/8/23/1106
> >> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> >> > Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
> >> > Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
> >> > Cc: linux-input@vger.kernel.org
> >> > ---
> >> >  drivers/input/mousedev.c | 1 +
> >> >  1 file changed, 1 insertion(+)
> >> >
> >> > diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
> >> > index
> >e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347
> >100644
> >> > --- a/drivers/input/mousedev.c
> >> > +++ b/drivers/input/mousedev.c
> >> > @@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file
> >*file, const char __user *buffer,
> >> >               mousedev_generate_response(client, c);
> >> >
> >> >               spin_unlock_irq(&client->packet_lock);
> >> > +             cond_resched();
> >> >       }
> >> >
> >> >       kill_fasync(&client->fasync, SIGIO, POLL_IN);
> >> > --
> >> > 2.19.0.605.g01d371f741-goog
> >> >
> >>
> >> Thanks.
> >>
> >> --
> >> Dmitry
>
>
> Thanks.
>
> --
> Dmitry

Dmitry Torokhov Oct. 4, 2018, 10:54 p.m. UTC | #7

On Thu, Oct 04, 2018 at 12:34:07PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 04, 2018 at 11:59:49AM -0700, Dmitry Torokhov wrote:
> > Hi Eric,
> > 
> > On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > > syzbot was able to trigger rcu stalls by calling write()
> > > with large number of bytes.
> > > 
> > > Add a cond_resched() in the loop to avoid this.
> > 
> > I think this simply masks a deeper issue. The code fetches characters
> > from userspace in a loop, takes a lock, quickly places response in an
> > output buffer, and releases interrupt. I do not see why this should
> > cause stalls as we do not hold spinlock/interrupts off for extended
> > period of time.
> > 
> > Adding Paul so he can straighten me out...
> 
> If you are running a !PREEMPT kernel, then you need the cond_resched()
> to allow the scheduler to choose someone else to run if needed and
> to let RCU know that grace periods can end.  Without the cond_resched(),
> if you stay in that loop long enough you will get excessive scheduling
> latencies and eventually even RCU CPU stall warning splats.
> 
> In a PREEMPT (instead of !PREEMPT) kernel, you would be right.  When
> preemption is enabled, the scheduler can preempt and RCU can sense
> lack of readers from the scheduling-clock interrupt handler.  Which
> is why cond_resched() is nothingness in a PREEMPT kernel.
> 
> But because people run !PREEMPT as well as PREEMPT kernels, if that loop
> can run for a long time, you need that cond_resched().

OK, I see. I'll apply the patch then.

I think evdev.c needs similar treatment as it will keep looping while
there is data...

Thanks.

Eric Dumazet Oct. 4, 2018, 11:01 p.m. UTC | #8

On 10/04/2018 03:54 PM, Dmitry Torokhov wrote:

> OK, I see. I'll apply the patch then.

Thanks !

> 
> I think evdev.c needs similar treatment as it will keep looping while
> there is data...

Yeah, presumably other drivers need care as well :/
