Message ID | c2278d36-993b-402f-dd07-635980112bbe@m-reimer.de (mailing list archive) |
---|---|
State | New, archived |

On Sun, Jun 19, 2016 at 02:44:35PM +0200, Manuel Reimer wrote:
> Hello,
>
> while debugging a problem with hid-sony I got stuck with a problem
> actually caused by ff-memless. In some situations effect aborting is
> delayed, so it may be triggered seconds after all devices have been
> destroyed, which causes the kernel to panic.
>
> The aborting request actually gets received here:
> https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L467
> This "aborting" flag is then handled here:
> https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L376
> But before this line is reached, there is a time check to check if
> the effect actually is due to be started:
> https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L359
>
> This time check now causes a problem if the effect, which is meant
> to be *aborted* was scheduled to be *started* some time in future
> and the device is destroyed before this time is reached.

I am not clear how this can happen. If effect hasn't actually started
playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
is not yet set), then when stopping effect we do not need to do anything
except clear FF_EFFECT_STARTED (since we did not touch the hardware
yet).

Now, if FF_EFFECT_PLAYING is set, that means that play_at time is in the
past and we won't be skipping this effect in ml_get_combo_effect().

Could you please post a stack trace of the crash you observed?

>
> My patch fixes this by setting the trigger times to "now" after
> setting the aborting flag. This way the following code deals with
> aborting the effect immediately without setting a timer for it.
>
> There may be other ways to fix this, so I would be happy to get some
> feedback.
>
> Signed-off-by: Manuel Reimer <mail@m-reimer.de>
>
> --- a/drivers/input/ff-memless.c	2016-05-13 16:06:29.722685021 +0200
> +++ b/drivers/input/ff-memless.c	2016-06-19 14:25:39.790375270 +0200
> @@ -463,9 +463,11 @@ static int ml_ff_playback(struct input_d
>  	} else {
>  		pr_debug("initiated stop\n");
>
> -		if (test_bit(FF_EFFECT_PLAYING, &state->flags))
> +		if (test_bit(FF_EFFECT_PLAYING, &state->flags)) {
>  			__set_bit(FF_EFFECT_ABORTING, &state->flags);
> -		else
> +			state->play_at = jiffies;
> +			state->adj_at = jiffies;
> +		} else
>  			__clear_bit(FF_EFFECT_STARTED, &state->flags);
>  	}
>

Thanks.

On 06/20/2016 07:33 PM, Dmitry Torokhov wrote:
> I am not clear how this can happen. If effect hasn't actually started
> playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
> is not yet set), then when stopping effect we do not need to do anything
> except clear FF_EFFECT_STARTED (since we did not touch the hardware
> yet).

That's true, but I think my crash works a bit different.

What I'm doing is "hammering" the code with "start playback" events.
Maybe not common in normal use, but it shouldn't be possible to
crash the kernel with something like this.

For the crash to happen, the uploaded effect has to have a replay
delay of some seconds. With this, what is actually happening is:

- The first playback request is accepted in ml_ff_playback
  (nonzero value to ml_ff_playback).
  --> FF_EFFECT_STARTED is set, FF_EFFECT_PLAYING not set
  --> play_at is in the future
- Some time later (play_at reached) the effect actually is started
  --> Now both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING are set
  --> play_at is in the past
- Now a new playback request for the same effect is accepted
  in ml_ff_playback
  --> Still both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING set
  --> play_at is now in the future!!!

=> That's the time where the USB plug is pulled
=> Kernel is trying to stop possible running effects
Current situation:
- The effect is playing and hasn't stopped playing so far
- The same effect is scheduled to be playing again
  (play_at in future)

- Now we get a "stop playback" request (zero value to ml_ff_playback)
  --> FF_EFFECT_PLAYING still set --> FF_EFFECT_ABORTING will be set
  --> play_at still in the future
- The "FF_EFFECT_ABORTING" request now isn't handled directly in
  ml_get_combo_effect, as it is filtered out "play_at in future".
- A timer is set (as play_at is in future)
- Some time later the aborting is triggered with all memory already
  freed in both, ff-memless and hid-sony, modules.

With this knowledge, it now is *very* easy to reproduce this one:
- Open fftest on the device
- press "4" and "enter", immediately pre-enter a "4" after that
- As soon as the controller starts to vibrate, press enter and
  disconnect the controller
- Wait some seconds

Crashes *every time* with the hid-sony module.
Produces ugly output on dmesg with xpad module. Seems like some
strings are read from freed memory in this case.

My patch fixes this by setting play_at to "now", so the
"FF_EFFECT_ABORTING" is handled immediately in this case. --> No
crash

Manuel
--
To unsubscribe from this list: send the line "unsubscribe linux-input" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

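
The sequence reported in this message can be modelled in userspace to make the deferred abort visible. The following is a minimal sketch, not the ff-memless code: the `sim_*` names, the tick counter standing in for jiffies, and the `fix_applied` switch are all hypothetical, and the scan only mirrors the ordering described in the thread (the time check comes before the abort check).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the ff-memless state bits. */
enum {
	SIM_STARTED  = 1u << 0,
	SIM_PLAYING  = 1u << 1,
	SIM_ABORTING = 1u << 2,
};

struct sim_effect {
	unsigned int flags;
	unsigned long play_at;	/* tick at which playback is due */
	unsigned long delay;	/* replay delay, in ticks */
};

/* Start request (non-zero value): always (re)schedules play_at. */
static void sim_start(struct sim_effect *e, unsigned long now)
{
	assert(e != NULL);
	e->flags |= SIM_STARTED;
	e->play_at = now + e->delay;
}

/*
 * Stop request (zero value). With fix_applied set, play_at is reset to
 * "now", which is what the proposed patch does.
 */
static void sim_stop(struct sim_effect *e, unsigned long now, bool fix_applied)
{
	assert(e != NULL);
	if (e->flags & SIM_PLAYING) {
		e->flags |= SIM_ABORTING;
		if (fix_applied)
			e->play_at = now;
	} else {
		e->flags &= ~SIM_STARTED;
	}
}

/*
 * One pass over the effect. Returns true while the effect is still
 * waiting for play_at, i.e. while a timer would have to be armed for a
 * later tick. The time check runs before the abort check.
 */
static bool sim_scan(struct sim_effect *e, unsigned long now)
{
	assert(e != NULL);
	if (!(e->flags & SIM_STARTED))
		return false;
	if (now < e->play_at)
		return true;		/* skipped, even if ABORTING is set */
	if (e->flags & SIM_ABORTING) {
		e->flags = 0;		/* abort handled, effect fully stopped */
		return false;
	}
	e->flags |= SIM_PLAYING;
	return false;
}
```

In this model, the sequence start, scan once due, start again, stop leaves `SIM_ABORTING` set and `sim_scan()` still returning true. That pending state corresponds to the timer that fires after the device is gone. With `fix_applied` the same sequence is resolved by the very next scan.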
On Sat, Jun 25, 2016 at 03:28:05PM +0200, Manuel Reimer wrote:
> On 06/20/2016 07:33 PM, Dmitry Torokhov wrote:
> >I am not clear how this can happen. If effect hasn't actually started
> >playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
> >is not yet set), then when stopping effect we do not need to do anything
> >except clear FF_EFFECT_STARTED (since we did not touch the hardware
> >yet).
>
> That's true, but I think my crash works a bit different.
>
> What I'm doing is "hammering" the code with "start playback" events.
> Maybe not common in normal use, but it shouldn't be possible to
> crash the kernel with something like this.
>
> For the crash to happen, the uploaded effect has to have a replay
> delay of some seconds. With this, what is actually happening is:
>
> - The first playback request is accepted in ml_ff_playback
>   (nonzero value to ml_ff_playback).
>   --> FF_EFFECT_STARTED is set, FF_EFFECT_PLAYING not set
>   --> play_at is in the future
> - Some time later (play_at reached) the effect actually is started
>   --> Now both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING are set
>   --> play_at is in the past
> - Now a new playback request for the same effect is accepted
>   in ml_ff_playback
>   --> Still both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING set
>   --> play_at is now in the future!!!
>
> => That's the time where the USB plug is pulled
> => Kernel is trying to stop possible running effects
> Current situation:
> - The effect is playing and hasn't stopped playing so far
> - The same effect is scheduled to be playing again
>   (play_at in future)
>
> - Now we get a "stop playback" request (zero value to ml_ff_playback)
>   --> FF_EFFECT_PLAYING still set --> FF_EFFECT_ABORTING will be set
>   --> play_at still in the future
> - The "FF_EFFECT_ABORTING" request now isn't handled directly in
>   ml_get_combo_effect, as it is filtered out "play_at in future".
> - A timer is set (as play_at is in future)
> - Some time later the aborting is triggered with all memory already
>   freed in both, ff-memless and hid-sony, modules.
>
> With this knowledge, it now is *very* easy to reproduce this one:
> - Open fftest on the device
> - press "4" and "enter", immediately pre-enter a "4" after that
> - As soon as the controller starts to vibrate, press enter and
>   disconnect the controller
> - Wait some seconds
>
> Crashes *every time* with the hid-sony module.
> Produces ugly output on dmesg with xpad module. Seems like some
> strings are read from freed memory in this case.
>
> My patch fixes this by setting play_at to "now", so the
> "FF_EFFECT_ABORTING" is handled immediately in this case. --> No
> crash

I see. I however wonder what the proper behavior should be when we
basically request to restart the effect. If start delay is 0 then new
effect parameters would immediately be applied, so I guess it would be
reasonable to say that if we apply new start_delay to the effect then
currently playing instance should be stopped...

Anssi, what is your take here?

Thanks.

25.06.2016, 18:29, Dmitry Torokhov wrote:
> On Sat, Jun 25, 2016 at 03:28:05PM +0200, Manuel Reimer wrote:
>> On 06/20/2016 07:33 PM, Dmitry Torokhov wrote:
>>> I am not clear how this can happen. If effect hasn't actually started
>>> playing (i.e. we have FF_EFFECT_STARTED bit set, but FF_EFFECT_PLAYING
>>> is not yet set), then when stopping effect we do not need to do anything
>>> except clear FF_EFFECT_STARTED (since we did not touch the hardware
>>> yet).
>>
>> That's true, but I think my crash works a bit different.
>>
>> What I'm doing is "hammering" the code with "start playback" events.
>> Maybe not common in normal use, but it shouldn't be possible to
>> crash the kernel with something like this.
>>
>> For the crash to happen, the uploaded effect has to have a replay
>> delay of some seconds. With this, what is actually happening is:
>>
>> - The first playback request is accepted in ml_ff_playback
>>   (nonzero value to ml_ff_playback).
>>   --> FF_EFFECT_STARTED is set, FF_EFFECT_PLAYING not set
>>   --> play_at is in the future
>> - Some time later (play_at reached) the effect actually is started
>>   --> Now both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING are set
>>   --> play_at is in the past
>> - Now a new playback request for the same effect is accepted
>>   in ml_ff_playback
>>   --> Still both, FF_EFFECT_STARTED and FF_EFFECT_PLAYING set
>>   --> play_at is now in the future!!!
>>
>> => That's the time where the USB plug is pulled
>> => Kernel is trying to stop possible running effects
>> Current situation:
>> - The effect is playing and hasn't stopped playing so far
>> - The same effect is scheduled to be playing again
>>   (play_at in future)
>>
>> - Now we get a "stop playback" request (zero value to ml_ff_playback)
>>   --> FF_EFFECT_PLAYING still set --> FF_EFFECT_ABORTING will be set
>>   --> play_at still in the future
>> - The "FF_EFFECT_ABORTING" request now isn't handled directly in
>>   ml_get_combo_effect, as it is filtered out "play_at in future".
>> - A timer is set (as play_at is in future)
>> - Some time later the aborting is triggered with all memory already
>>   freed in both, ff-memless and hid-sony, modules.
>>
>> With this knowledge, it now is *very* easy to reproduce this one:
>> - Open fftest on the device
>> - press "4" and "enter", immediately pre-enter a "4" after that
>> - As soon as the controller starts to vibrate, press enter and
>>   disconnect the controller
>> - Wait some seconds
>>
>> Crashes *every time* with the hid-sony module.
>> Produces ugly output on dmesg with xpad module. Seems like some
>> strings are read from freed memory in this case.
>>
>> My patch fixes this by setting play_at to "now", so the
>> "FF_EFFECT_ABORTING" is handled immediately in this case. --> No
>> crash
>
> I see. I however wonder what the proper behavior should be when we
> basically request to restart the effect. If start delay is 0 then new
> effect parameters would immediately be applied, so I guess it would be
> reasonable to say that if we apply new start_delay to the effect then
> currently playing instance should be stopped...
>
> Anssi, what is your take here?

I agree with you, starting a delayed effect should immediately stop the
effect if it is already running.

So I guess e.g. ml_ff_playback() should set FF_EFFECT_ABORTING in such a
case, and ml_get_combo_effect() should process FF_EFFECT_ABORTING
effects regardless of their ->play_at being in the future (the latter
would fix the original issue too, AFAICS).

I guess an easier-to-follow state bit set would be e.g. the following:
FF_EFFECT_STARTED: always reflects started/stopped state
FF_EFFECT_PLAYING: always reflects hardware state
FF_EFFECT_DIRTY: effect requires immediate refresh (stop or modify)

which would make it clearer where each is set/cleared as there is no
more need for "delayed" handling of STARTED/PLAYING via ABORTING or
clearing FF_EFFECT_PLAYING in various places to force an effect refresh.

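
The second suggestion in the message above, handling FF_EFFECT_ABORTING regardless of ->play_at, amounts to reordering the skip test in the scan. A rough userspace sketch of that predicate follows; `sketch_effect`, the `SKETCH_*` bits and `sketch_should_skip()` are hypothetical names, and a plain integer comparison stands in for the kernel's wrap-safe `time_before()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the FF_EFFECT_* state bits. */
enum {
	SKETCH_STARTED  = 1u << 0,
	SKETCH_PLAYING  = 1u << 1,
	SKETCH_ABORTING = 1u << 2,
};

struct sketch_effect {
	unsigned int flags;
	unsigned long play_at;	/* tick at which playback is due */
};

/*
 * Returns true when the scan should leave this effect alone for now.
 * A pending abort is never skipped, so it cannot be deferred to a timer
 * that might fire after the device has been destroyed.
 */
static bool sketch_should_skip(const struct sketch_effect *e, unsigned long now)
{
	assert(e != NULL);
	if (!(e->flags & SKETCH_STARTED))
		return true;	/* not started: nothing to do */
	if (e->flags & SKETCH_ABORTING)
		return false;	/* process the abort right away */
	return now < e->play_at;	/* otherwise wait until the effect is due */
}
```

This only covers the abort path. How a restart request should re-arm the effect after the running instance has been stopped is left open here, as it is in the thread.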
--- a/drivers/input/ff-memless.c	2016-05-13 16:06:29.722685021 +0200
+++ b/drivers/input/ff-memless.c	2016-06-19 14:25:39.790375270 +0200
@@ -463,9 +463,11 @@ static int ml_ff_playback(struct input_d
 	} else {
 		pr_debug("initiated stop\n");
 
-		if (test_bit(FF_EFFECT_PLAYING, &state->flags))
+		if (test_bit(FF_EFFECT_PLAYING, &state->flags)) {
 			__set_bit(FF_EFFECT_ABORTING, &state->flags);
-		else
+			state->play_at = jiffies;
+			state->adj_at = jiffies;
+		} else
 			__clear_bit(FF_EFFECT_STARTED, &state->flags);
 	}

Hello,

while debugging a problem with hid-sony I got stuck with a problem
actually caused by ff-memless. In some situations effect aborting is
delayed, so it may be triggered seconds after all devices have been
destroyed, which causes the kernel to panic.

The aborting request actually gets received here:
https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L467
This "aborting" flag is then handled here:
https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L376
But before this line is reached, there is a time check to check if
the effect actually is due to be started:
https://github.com/torvalds/linux/blob/master/drivers/input/ff-memless.c#L359

This time check now causes a problem if the effect, which is meant
to be *aborted* was scheduled to be *started* some time in future
and the device is destroyed before this time is reached.

My patch fixes this by setting the trigger times to "now" after
setting the aborting flag. This way the following code deals with
aborting the effect immediately without setting a timer for it.

There may be other ways to fix this, so I would be happy to get some
feedback.

Signed-off-by: Manuel Reimer <mail@m-reimer.de>