Message ID | 20201016173020.12686-1-jassisinghbrar@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | mailbox: avoid timer start from callback |
On Fri, Oct 16, 2020 at 12:30:20PM -0500, jassisinghbrar@gmail.com wrote:
> From: Jassi Brar <jaswinder.singh@linaro.org>
>
> If the txdone is done by polling, it is possible for msg_submit() to start
> the timer while txdone_hrtimer() callback is running. If the timer needs
> rescheduling, it could already be enqueued by the time hrtimer_forward_now()
> is called, leading hrtimer to loudly complain.
>
> WARNING: CPU: 3 PID: 74 at kernel/time/hrtimer.c:932 hrtimer_forward+0xc4/0x110
> CPU: 3 PID: 74 Comm: kworker/u8:1 Not tainted 5.9.0-rc2-00236-gd3520067d01c-dirty #5
> Hardware name: Libre Computer AML-S805X-AC (DT)
> Workqueue: events_freezable_power_ thermal_zone_device_check
> pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--)
> pc : hrtimer_forward+0xc4/0x110
> lr : txdone_hrtimer+0xf8/0x118
> [...]
>
> This can be fixed by not starting the timer from the callback path. Which
> requires the timer reloading as long as any message is queued on the
> channel, and not just when current tx is not done yet.
>

I came to a similar conclusion and was testing something similar. You beat
me to it. Since we have a single timer and multiple channels, each time a
message is enqueued on any channel, the timer gets added, which is wrong.

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

I tested this patch too by reverting the offending commit in -next, so

Tested-by: Sudeep Holla <sudeep.holla@arm.com>

You seem to have dropped the Fixes tags. Is that intentional? If so, any
particular reason? I think it is stable material, and it is better to have
the Fixes tag so that it gets added to stable trees.

--
Regards,
Sudeep
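The "single timer, multiple channels" point can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct fake_timer`, `submit_unguarded()`, `submit_guarded()` and `count_starts()` are hypothetical names, and the `active` flag merely stands in for what `hrtimer_active()` reports on the controller's shared `poll_hrt`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller's single poll timer. */
struct fake_timer {
	bool active;		/* models hrtimer_active(): queued or callback running */
	int start_calls;	/* how many times a start was actually issued */
};

/* Old behaviour: every successful submit starts the shared timer. */
static void submit_unguarded(struct fake_timer *t)
{
	t->start_calls++;
	t->active = true;
}

/* Patched behaviour: start the shared timer only if it is not active. */
static void submit_guarded(struct fake_timer *t)
{
	if (!t->active) {
		t->start_calls++;
		t->active = true;
	}
}

/* Submit one message on each of nchans channels that share one timer. */
static int count_starts(int nchans, bool guarded)
{
	struct fake_timer t = { .active = false, .start_calls = 0 };

	assert(nchans >= 0);
	for (int i = 0; i < nchans; i++) {
		if (guarded)
			submit_guarded(&t);
		else
			submit_unguarded(&t);
	}
	return t.start_calls;
}
```

With four channels, the unguarded variant issues four starts on the same timer, while the guarded variant issues one. The model ignores locking and timer expiry, so it only shows the counting argument, not the race itself.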
On Fri, Oct 16, 2020 at 12:50 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Fri, Oct 16, 2020 at 12:30:20PM -0500, jassisinghbrar@gmail.com wrote:
> > From: Jassi Brar <jaswinder.singh@linaro.org>
> >
> > [...]
> >
> > This can be fixed by not starting the timer from the callback path. Which
> > requires the timer reloading as long as any message is queued on the
> > channel, and not just when current tx is not done yet.
> >
>
> I came to a similar conclusion and was testing something similar. You beat
> me to it. Since we have a single timer and multiple channels, each time a
> message is enqueued on any channel, the timer gets added, which is wrong.
>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
>
> I tested this patch too by reverting the offending commit in -next, so
>
> Tested-by: Sudeep Holla <sudeep.holla@arm.com>
>
> You seem to have dropped the Fixes tags. Is that intentional? If so, any
> particular reason? I think it is stable material, and it is better to have
> the Fixes tag so that it gets added to stable trees.
>
Thanks for testing.
I will decorate it appropriately once I have Jerome's tested-by too.

-jassi
On Fri 16 Oct 2020 at 19:30, jassisinghbrar@gmail.com wrote:

> From: Jassi Brar <jaswinder.singh@linaro.org>
>
> [...]
>
> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
> ---
>  drivers/mailbox/mailbox.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
> index 0b821a5b2db8..a093a6ecaa66 100644
> --- a/drivers/mailbox/mailbox.c
> +++ b/drivers/mailbox/mailbox.c
> @@ -82,9 +82,12 @@ static void msg_submit(struct mbox_chan *chan)
>  exit:
>  	spin_unlock_irqrestore(&chan->lock, flags);
>
> -	if (!err && (chan->txdone_method & TXDONE_BY_POLL))
> -		/* kick start the timer immediately to avoid delays */
> -		hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
> +	/* kick start the timer immediately to avoid delays */
> +	if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
> +		/* but only if not already active */

It would solve the problem I reported as well, but instead of running the
check immediately (timer with value 0), we will have to wait for the next
expiry of the timer if it is already started. IOW, there might be a delay
now. I don't know if this is important for the mailbox - the existing
comments in the code suggested it was.

> +		if (!hrtimer_active(&chan->mbox->poll_hrt))
> +			hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
> +	}
>  }
>
>  static void tx_tick(struct mbox_chan *chan, int r)
> @@ -122,11 +125,10 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
>  		struct mbox_chan *chan = &mbox->chans[i];
>
>  		if (chan->active_req && chan->cl) {
> +			resched = true;
>  			txdone = chan->mbox->ops->last_tx_done(chan);
>  			if (txdone)
>  				tx_tick(chan, 0);
> -			else
> -				resched = true;
>  		}
>  	}
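The delay concern raised in this review can be made concrete with a rough model. It is a sketch under stated assumptions: `wait_for_poll_us()` and its parameters are hypothetical, times are in microseconds, and a zero-delay `hrtimer_start()` is treated as firing immediately, which ignores scheduling latency.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of how long a freshly submitted message waits for its
 * first txdone poll under the "start only if not active" rule.
 *
 * timer_active: what hrtimer_active() would report at submit time
 * remaining_us: time left until the already scheduled expiry
 * period_us:    the poll period used when the timer is re-armed
 */
static unsigned int wait_for_poll_us(bool timer_active,
				     unsigned int remaining_us,
				     unsigned int period_us)
{
	/* A pending expiry is never further away than one poll period. */
	assert(remaining_us <= period_us);

	if (!timer_active)
		return 0;		/* timer is started with a zero delay */

	return remaining_us;		/* wait for the expiry already queued */
}
```

In this model, a message submitted while the controller is idle is still polled right away, and a message submitted while the timer is pending waits at most one poll period.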
On Fri, Oct 16, 2020 at 1:38 PM Jerome Brunet <jbrunet@baylibre.com> wrote:
>
> On Fri 16 Oct 2020 at 19:30, jassisinghbrar@gmail.com wrote:
>
> > From: Jassi Brar <jaswinder.singh@linaro.org>
> >
> > [...]
> >
> > @@ -82,9 +82,12 @@ static void msg_submit(struct mbox_chan *chan)
> >  exit:
> >  	spin_unlock_irqrestore(&chan->lock, flags);
> >
> > -	if (!err && (chan->txdone_method & TXDONE_BY_POLL))
> > -		/* kick start the timer immediately to avoid delays */
> > -		hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
> > +	/* kick start the timer immediately to avoid delays */
> > +	if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
> > +		/* but only if not already active */
>
> It would solve the problem I reported as well, but instead of running the
> check immediately (timer with value 0), we will have to wait for the next
> expiry of the timer if it is already started. IOW, there might be a delay
> now. I don't know if this is important for the mailbox - the existing
> comments in the code suggested it was.
>
That comment is for when the first message is queued on the channel,
which remains unimpacted.
So, do I have your tested/acked-by?

thnx,
On Fri 16 Oct 2020 at 20:45, Jassi Brar <jassisinghbrar@gmail.com> wrote:

> On Fri, Oct 16, 2020 at 1:38 PM Jerome Brunet <jbrunet@baylibre.com> wrote:
>>
>> On Fri 16 Oct 2020 at 19:30, jassisinghbrar@gmail.com wrote:
>>
>> > [...]
>> >
>> > +	/* kick start the timer immediately to avoid delays */
>> > +	if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
>> > +		/* but only if not already active */
>>
>> It would solve the problem I reported as well, but instead of running the
>> check immediately (timer with value 0), we will have to wait for the next
>> expiry of the timer if it is already started. IOW, there might be a delay
>> now. I don't know if this is important for the mailbox - the existing
>> comments in the code suggested it was.
>>
> That comment is for when the first message is queued on the channel,
> which remains unimpacted.
> So, do I have your tested/acked-by?

Sure, go ahead.

Acked-by: Jerome Brunet <jbrunet@baylibre.com>
Tested-by: Jerome Brunet <jbrunet@baylibre.com>

>
> thnx,
diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 0b821a5b2db8..a093a6ecaa66 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -82,9 +82,12 @@ static void msg_submit(struct mbox_chan *chan)
 exit:
 	spin_unlock_irqrestore(&chan->lock, flags);

-	if (!err && (chan->txdone_method & TXDONE_BY_POLL))
-		/* kick start the timer immediately to avoid delays */
-		hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
+	/* kick start the timer immediately to avoid delays */
+	if (!err && (chan->txdone_method & TXDONE_BY_POLL)) {
+		/* but only if not already active */
+		if (!hrtimer_active(&chan->mbox->poll_hrt))
+			hrtimer_start(&chan->mbox->poll_hrt, 0, HRTIMER_MODE_REL);
+	}
 }

 static void tx_tick(struct mbox_chan *chan, int r)
@@ -122,11 +125,10 @@ static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
 		struct mbox_chan *chan = &mbox->chans[i];

 		if (chan->active_req && chan->cl) {
+			resched = true;
 			txdone = chan->mbox->ops->last_tx_done(chan);
 			if (txdone)
 				tx_tick(chan, 0);
-			else
-				resched = true;
 		}
 	}
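The second hunk is easier to follow with a userspace model of one poll pass. The following is a sketch, not the kernel implementation: `struct fake_chan`, `fake_tx_tick()` and `poll_once()` are hypothetical, and the model assumes that the submit path cannot start the timer while the callback runs, because `hrtimer_active()` reports a running callback as active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical channel state. */
struct fake_chan {
	bool active;	/* models chan->active_req being set */
	int queued;	/* further messages waiting behind the active one */
	bool hw_done;	/* what last_tx_done() would report */
};

/*
 * Models tx_tick(): complete the active request, then submit the next
 * queued message. No timer start happens here, since the timer counts
 * as active while its callback is running.
 */
static void fake_tx_tick(struct fake_chan *c)
{
	c->active = false;
	if (c->queued > 0) {
		c->queued--;
		c->active = true;
		c->hw_done = false;	/* the new message is still in flight */
	}
}

/* One poll pass; returns true if the timer should be re-armed. */
static bool poll_once(struct fake_chan *chans, int n, bool patched)
{
	bool resched = false;

	assert(chans != NULL && n >= 0);
	for (int i = 0; i < n; i++) {
		struct fake_chan *c = &chans[i];

		if (!c->active)
			continue;
		if (patched)
			resched = true;		/* any active request re-arms */
		if (c->hw_done)
			fake_tx_tick(c);
		else if (!patched)
			resched = true;		/* old rule: only unfinished tx */
	}
	return resched;
}
```

With one finished message and one more queued behind it, the old rule returns false and leaves the next message without a poll unless the submit path starts the timer from inside the callback. The patched rule returns true, so the callback itself keeps the timer running. In this model the patched rule also costs one extra poll pass after the last message completes.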