Message ID | 20181004154749.111595-1-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Series | Input: mousedev - add a schedule point in mousedev_write() |
Hi Eric,

On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> syzbot was able to trigger rcu stalls by calling write()
> with large number of bytes.
>
> Add a cond_resched() in the loop to avoid this.

I think this simply masks a deeper issue. The code fetches characters
from userspace in a loop, takes a lock, quickly places response in an
output buffer, and releases interrupt. I do not see why this should
cause stalls as we do not hold spinlock/interrupts off for extended
period of time.

Adding Paul so he can straighten me out...

Thanks.
On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
>
> Hi Eric,
>
> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > syzbot was able to trigger rcu stalls by calling write()
> > with large number of bytes.
> >
> > Add a cond_resched() in the loop to avoid this.
>
> I think this simply masks a deeper issue. The code fetches characters
> from userspace in a loop, takes a lock, quickly places response in an
> output buffer, and releases interrupt. I do not see why this should
> cause stalls as we do not hold spinlock/interrupts off for extended
> period of time.
>
> Adding Paul so he can straighten me out...
>

Well...

write(fd, buffer, 0x7FFF0000);

Takes between 20 seconds and 2 minutes depending on CONFIG options ....

So either apply my patch, or add a limit on the max count, and
possibly break legitimate user space ?

I dunno...
On Thu, Oct 04, 2018 at 11:59:49AM -0700, Dmitry Torokhov wrote:
> Hi Eric,
>
> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > syzbot was able to trigger rcu stalls by calling write()
> > with large number of bytes.
> >
> > Add a cond_resched() in the loop to avoid this.
>
> I think this simply masks a deeper issue. The code fetches characters
> from userspace in a loop, takes a lock, quickly places response in an
> output buffer, and releases interrupt. I do not see why this should
> cause stalls as we do not hold spinlock/interrupts off for extended
> period of time.
>
> Adding Paul so he can straighten me out...

If you are running a !PREEMPT kernel, then you need the cond_resched()
to allow the scheduler to choose someone else to run if needed and
to let RCU know that grace periods can end. Without the cond_resched(),
if you stay in that loop long enough you will get excessive scheduling
latencies and eventually even RCU CPU stall warning splats.

In a PREEMPT (instead of !PREEMPT) kernel, you would be right. When
preemption is enabled, the scheduler can preempt and RCU can sense
lack of readers from the scheduling-clock interrupt handler. Which
is why cond_resched() is nothingness in a PREEMPT kernel.

But because people run !PREEMPT as well as PREEMPT kernels, if that loop
can run for a long time, you need that cond_resched().

Thanx, Paul
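The !PREEMPT argument above can be pictured with a small user-space analogy. This is only a sketch: `yield_point()`, `yield_calls` and `process_bytes()` are hypothetical names, and `sched_yield()` stands in for the kernel's `cond_resched()`, which is not available outside the kernel and does not behave identically. The loop has the same shape as the one being patched: a short piece of per-byte work, then an explicit point where the CPU may be handed to someone else.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical counter so a caller can observe how often we yielded. */
static unsigned long yield_calls;

/* User-space stand-in for cond_resched(): offer the CPU to another
 * runnable task. An analogy only, not the kernel API. */
static void yield_point(void)
{
	yield_calls++;
	sched_yield();
}

/* Consume `count` bytes one at a time: brief per-byte work followed by
 * a yield point, mirroring the shape of the patched loop. */
static size_t process_bytes(const unsigned char *buf, size_t count,
			    unsigned int *checksum)
{
	size_t i;

	assert(buf != NULL && checksum != NULL);

	for (i = 0; i < count; i++) {
		*checksum += buf[i];	/* placeholder for the locked section */
		yield_point();		/* where the patch adds cond_resched() */
	}
	return i;
}
```

Without the `yield_point()` call, nothing in the loop body would ever give the CPU away voluntarily, which is the situation described for a kernel built without preemption.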
On Thu, Oct 04, 2018 at 12:28:56PM -0700, Eric Dumazet wrote:
> On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> >
> > Hi Eric,
> >
> > On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > > syzbot was able to trigger rcu stalls by calling write()
> > > with large number of bytes.
> > >
> > > Add a cond_resched() in the loop to avoid this.
> >
> > I think this simply masks a deeper issue. The code fetches characters
> > from userspace in a loop, takes a lock, quickly places response in an
> > output buffer, and releases interrupt. I do not see why this should
> > cause stalls as we do not hold spinlock/interrupts off for extended
> > period of time.
> >
> > Adding Paul so he can straighten me out...
> >
>
> Well...
>
> write(fd, buffer, 0x7FFF0000);
>
> Takes between 20 seconds and 2 minutes depending on CONFIG options ....

And two minutes would get you an RCU CPU stall warning, even on distro
kernels that set the stall-warning time to a full minute (as opposed
to 21 seconds in mainline).

> So either apply my patch, or add a limit on the max count, and
> possibly break legitimate user space ?
>
> I dunno...

I vote for Eric's patch. In fact:

Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
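The figures quoted above (a 0x7FFF0000-byte write taking 20 seconds to 2 minutes, against stall-warning timeouts of 21 and 60 seconds) can be turned into a quick back-of-the-envelope check. The helper names below are hypothetical and the sketch only restates the arithmetic; it says nothing about how the kernel itself measures stalls.

```c
#include <assert.h>

/* Byte count from the write() call quoted in the thread. */
#define WRITE_BYTES 0x7FFF0000UL

/* Average nanoseconds spent per byte if the whole write takes `seconds`. */
static double ns_per_byte(double seconds)
{
	assert(seconds >= 0.0);
	return seconds * 1e9 / (double)WRITE_BYTES;
}

/* Nonzero if a loop that runs for `seconds` without a break would
 * outlast a stall-warning timeout of `timeout_s` seconds. */
static int exceeds_stall_timeout(double seconds, double timeout_s)
{
	return seconds > timeout_s;
}
```

With the quoted numbers this works out to roughly 9 ns per byte for the 20-second case and roughly 56 ns per byte for the 2-minute case. Each iteration is cheap; it is the iteration count that pushes the total past either timeout.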
On October 4, 2018 12:28:56 PM PDT, Eric Dumazet <edumazet@google.com> wrote:
>On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
><dmitry.torokhov@gmail.com> wrote:
>>
>> Hi Eric,
>>
>> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
>> > syzbot was able to trigger rcu stalls by calling write()
>> > with large number of bytes.
>> >
>> > Add a cond_resched() in the loop to avoid this.
>>
>> I think this simply masks a deeper issue. The code fetches characters
>> from userspace in a loop, takes a lock, quickly places response in an
>> output buffer, and releases interrupt. I do not see why this should
>> cause stalls as we do not hold spinlock/interrupts off for extended
>> period of time.
>>
>> Adding Paul so he can straighten me out...
>>
>
>Well...
>
>write(fd, buffer, 0x7FFF0000);
>
>Takes between 20 seconds and 2 minutes depending on CONFIG options ....

That's fine even if it takes a couple of years. We are not holding
spinlock for the entirety of this time, so we should get bumped off CPU
at some point.

>
>So either apply my patch, or add a limit on the max count, and
>possibly break legitimate user space ?

Legitimate users write a single character at a time and read response,
so exiting after, let's say, 32 bytes would be fine. But I still want to
understand why we have to do that.

>
>I dunno...

Thanks.
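The alternative floated above, consuming at most about 32 bytes per write() call, could look roughly like the following in user-space terms. It is a hypothetical sketch of an option the thread discusses but does not adopt (the cond_resched() patch is what gets applied); `MAX_BYTES_PER_WRITE`, `capped_count()` and `calls_needed()` are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cap, taken from the "let's say, 32 bytes" remark. */
#define MAX_BYTES_PER_WRITE 32

/* Number of bytes a capped write handler would consume in one call.
 * A caller that gets back less than it asked for is expected to retry
 * with the remainder, as with any short write. */
static size_t capped_count(size_t requested)
{
	return requested < MAX_BYTES_PER_WRITE ? requested : MAX_BYTES_PER_WRITE;
}

/* How many calls a well-behaved caller needs to push `total` bytes. */
static size_t calls_needed(size_t total)
{
	size_t calls = 0;

	while (total > 0) {
		size_t done = capped_count(total);

		assert(done > 0);	/* progress is guaranteed while total > 0 */
		total -= done;
		calls++;
	}
	return calls;
}
```

The trade-off is the one raised in the thread: a cap bounds the time spent in a single call, but any caller that assumes a large write completes in one go would see different behaviour.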
On Thu, Oct 4, 2018 at 12:38 PM Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote:
>
> On October 4, 2018 12:28:56 PM PDT, Eric Dumazet <edumazet@google.com> wrote:
> >On Thu, Oct 4, 2018 at 11:59 AM Dmitry Torokhov
> ><dmitry.torokhov@gmail.com> wrote:
> >>
> >> Hi Eric,
> >>
> >> On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> >> > syzbot was able to trigger rcu stalls by calling write()
> >> > with large number of bytes.
> >> >
> >> > Add a cond_resched() in the loop to avoid this.
> >>
> >> I think this simply masks a deeper issue. The code fetches characters
> >> from userspace in a loop, takes a lock, quickly places response in an
> >> output buffer, and releases interrupt. I do not see why this should
> >> cause stalls as we do not hold spinlock/interrupts off for extended
> >> period of time.
> >>
> >> Adding Paul so he can straighten me out...
> >>
> >
> >Well...
> >
> >write(fd, buffer, 0x7FFF0000);
> >
> >Takes between 20 seconds and 2 minutes depending on CONFIG options ....
>
> That's fine even if it takes a couple of years. We are not holding
> spinlock for the entirety of this time, so we should get bumped off CPU
> at some point.

Well, you are saying that we could get rid of all cond_resched() calls
in the kernel.

You should send patches asap ;)

> >
> >So either apply my patch, or add a limit on the max count, and
> >possibly break legitimate user space ?
>
> Legitimate users write a single character at a time and read response,
> so exiting after, let's say, 32 bytes would be fine. But I still want to
> understand why we have to do that.
>
> >
> >I dunno...
On Thu, Oct 04, 2018 at 12:34:07PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 04, 2018 at 11:59:49AM -0700, Dmitry Torokhov wrote:
> > Hi Eric,
> >
> > On Thu, Oct 04, 2018 at 08:47:49AM -0700, Eric Dumazet wrote:
> > > syzbot was able to trigger rcu stalls by calling write()
> > > with large number of bytes.
> > >
> > > Add a cond_resched() in the loop to avoid this.
> >
> > I think this simply masks a deeper issue. The code fetches characters
> > from userspace in a loop, takes a lock, quickly places response in an
> > output buffer, and releases interrupt. I do not see why this should
> > cause stalls as we do not hold spinlock/interrupts off for extended
> > period of time.
> >
> > Adding Paul so he can straighten me out...
>
> If you are running a !PREEMPT kernel, then you need the cond_resched()
> to allow the scheduler to choose someone else to run if needed and
> to let RCU know that grace periods can end. Without the cond_resched(),
> if you stay in that loop long enough you will get excessive scheduling
> latencies and eventually even RCU CPU stall warning splats.
>
> In a PREEMPT (instead of !PREEMPT) kernel, you would be right. When
> preemption is enabled, the scheduler can preempt and RCU can sense
> lack of readers from the scheduling-clock interrupt handler. Which
> is why cond_resched() is nothingness in a PREEMPT kernel.
>
> But because people run !PREEMPT as well as PREEMPT kernels, if that loop
> can run for a long time, you need that cond_resched().

OK, I see. I'll apply the patch then.

I think evdev.c needs similar treatment as it will keep looping while
there is data...

Thanks.
On 10/04/2018 03:54 PM, Dmitry Torokhov wrote:

> OK, I see. I'll apply the patch then.

Thanks !

>
> I think evdev.c needs similar treatment as it will keep looping while
> there is data...

Yeah, presumably other drivers need care as well :/
diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index e08228061bcdd2f97aaadece31d6c83eb7539ae5..412fa71245afe26a7a8ad75705566f83633ba347 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -707,6 +707,7 @@ static ssize_t mousedev_write(struct file *file, const char __user *buffer,
 		mousedev_generate_response(client, c);
 
 		spin_unlock_irq(&client->packet_lock);
+		cond_resched();
 	}
 
 	kill_fasync(&client->fasync, SIGIO, POLL_IN);
syzbot was able to trigger rcu stalls by calling write()
with large number of bytes.

Add a cond_resched() in the loop to avoid this.

Link: https://lkml.org/lkml/2018/8/23/1106
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot+9436b02171ac0894d33e@syzkaller.appspotmail.com
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: linux-input@vger.kernel.org
---
 drivers/input/mousedev.c | 1 +
 1 file changed, 1 insertion(+)