[v2] hw/ptimer: Don't wrap around counter for expired timer that uses tick handler

Message ID 20160625123521.16752-1-digetx@gmail.com (mailing list archive)
State New, archived

Commit Message

Dmitry Osipenko June 25, 2016, 12:35 p.m. UTC
Software should see timer counter wrap around only after IRQ being triggered.
Change returned counter value to "1" for the expired timer and avoid returning
wrapped around counter value in periodic mode for the timer that has bottom-half
handler setup, assuming it drives timer IRQ.

This fixes regression introduced by the commit 5a50307 ("hw/ptimer: Perform
counter wrap around if timer already expired") on SPARC emulated machine as
reported by Mark Cave-Ayland.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
 hw/core/ptimer.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Dmitry Osipenko June 25, 2016, 12:47 p.m. UTC | #1
V2: Patch applies cleanly to the QEMU master branch.
Mark Cave-Ayland June 25, 2016, 1:20 p.m. UTC | #2
On 25/06/16 13:35, Dmitry Osipenko wrote:

> Software should see timer counter wrap around only after IRQ being triggered.
> Change returned counter value to "1" for the expired timer and avoid returning
> wrapped around counter value in periodic mode for the timer that has bottom-half
> handler setup, assuming it drives timer IRQ.
>
> This fixes regression introduced by the commit 5a50307 ("hw/ptimer: Perform
> counter wrap around if timer already expired") on SPARC emulated machine as
> reported by Mark Cave-Ayland.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  hw/core/ptimer.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
> index 05b0c27..8006442 100644
> --- a/hw/core/ptimer.c
> +++ b/hw/core/ptimer.c
> @@ -93,10 +93,10 @@ uint64_t ptimer_get_count(ptimer_state *s)
>          bool oneshot = (s->enabled == 2);
>
>          /* Figure out the current counter value.  */
> -        if (s->period == 0 || (expired && (oneshot || use_icount))) {
> +        if (expired && (oneshot || use_icount || s->bh != NULL)) {
>              /* Prevent timer underflowing if it should already have
>                 triggered.  */
> -            counter = 0;
> +            counter = 1;
>          } else {
>              uint64_t rem;
>              uint64_t div;
> @@ -143,7 +143,9 @@ uint64_t ptimer_get_count(ptimer_state *s)
>
>              if (expired && counter != 0) {
>                  /* Wrap around periodic counter.  */
> -                counter = s->limit - (counter - 1) % s->limit;
> +                counter = s->delta = s->limit - (counter - 1) % s->limit;
> +                /* Re-arm timer according to the wrapped around value.  */
> +                ptimer_reload(s);
>              }
>          }
>      } else {
>

Hi Dmitry,

I ran through all of my OpenBIOS test images for qemu-system-sparc and 
AFAICT this fixes the issue without introducing any further regressions so:

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>


Many thanks,

Mark.
Dmitry Osipenko June 25, 2016, 1:59 p.m. UTC | #3
On 25.06.2016 16:20, Mark Cave-Ayland wrote:
> On 25/06/16 13:35, Dmitry Osipenko wrote:
> 
>> Software should see timer counter wrap around only after IRQ being triggered.
>> Change returned counter value to "1" for the expired timer and avoid returning
>> wrapped around counter value in periodic mode for the timer that has bottom-half
>> handler setup, assuming it drives timer IRQ.
>>
>> This fixes regression introduced by the commit 5a50307 ("hw/ptimer: Perform
>> counter wrap around if timer already expired") on SPARC emulated machine as
>> reported by Mark Cave-Ayland.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  hw/core/ptimer.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
>> index 05b0c27..8006442 100644
>> --- a/hw/core/ptimer.c
>> +++ b/hw/core/ptimer.c
>> @@ -93,10 +93,10 @@ uint64_t ptimer_get_count(ptimer_state *s)
>>          bool oneshot = (s->enabled == 2);
>>
>>          /* Figure out the current counter value.  */
>> -        if (s->period == 0 || (expired && (oneshot || use_icount))) {
>> +        if (expired && (oneshot || use_icount || s->bh != NULL)) {
>>              /* Prevent timer underflowing if it should already have
>>                 triggered.  */
>> -            counter = 0;
>> +            counter = 1;
>>          } else {
>>              uint64_t rem;
>>              uint64_t div;
>> @@ -143,7 +143,9 @@ uint64_t ptimer_get_count(ptimer_state *s)
>>
>>              if (expired && counter != 0) {
>>                  /* Wrap around periodic counter.  */
>> -                counter = s->limit - (counter - 1) % s->limit;
>> +                counter = s->delta = s->limit - (counter - 1) % s->limit;
>> +                /* Re-arm timer according to the wrapped around value.  */
>> +                ptimer_reload(s);
>>              }
>>          }
>>      } else {
>>
> 
> Hi Dmitry,
> 
> I ran through all of my OpenBIOS test images for qemu-system-sparc and AFAICT
> this fixes the issue without introducing any further regressions so:
> 
> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> 
> 
> Many thanks,
> 
> Mark.
> 

Great! Thanks for testing it.
Peter Maydell June 27, 2016, 1:27 p.m. UTC | #4
On 25 June 2016 at 13:35, Dmitry Osipenko <digetx@gmail.com> wrote:
> Software should see timer counter wrap around only after IRQ being triggered.
> Change returned counter value to "1" for the expired timer and avoid returning
> wrapped around counter value in periodic mode for the timer that has bottom-half
> handler setup, assuming it drives timer IRQ.
>
> This fixes regression introduced by the commit 5a50307 ("hw/ptimer: Perform
> counter wrap around if timer already expired") on SPARC emulated machine as
> reported by Mark Cave-Ayland.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  hw/core/ptimer.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
> index 05b0c27..8006442 100644
> --- a/hw/core/ptimer.c
> +++ b/hw/core/ptimer.c
> @@ -93,10 +93,10 @@ uint64_t ptimer_get_count(ptimer_state *s)
>          bool oneshot = (s->enabled == 2);
>
>          /* Figure out the current counter value.  */
> -        if (s->period == 0 || (expired && (oneshot || use_icount))) {
> +        if (expired && (oneshot || use_icount || s->bh != NULL)) {
>              /* Prevent timer underflowing if it should already have
>                 triggered.  */
> -            counter = 0;
> +            counter = 1;
>          } else {
>              uint64_t rem;
>              uint64_t div;

I guess this fixes a regression, but it looks really weird.
Why should the timer behaviour change if there happens to be
a bottom half present? That should be an internal implementation
detail. It's also a bit odd that use_icount is in the check:
that shouldn't generally affect device emulation behaviour...

thanks
-- PMM
Dmitry Osipenko June 27, 2016, 6:26 p.m. UTC | #5
On 27.06.2016 16:27, Peter Maydell wrote:
> I guess this fixes a regression, but it looks really weird.
> Why should the timer behaviour change if there happens to be
> a bottom half present? That should be an internal implementation
> detail. It's also a bit odd that use_icount is in the check:
> that shouldn't generally affect device emulation behaviour...

In the case of a polled timer that has no ptimer trigger bottom-half callback
set up, we are free to wrap the counter around, since the timer behaviour isn't
changed from the ptimer user's perspective: the user cannot change the timer's
state in a handler.

I just decided to keep that wraparound feature for the case of a polled
free-running timer; this should result in a better distribution of the polled
values. The potential users of that feature are the "imx_epit" and "digic" timer
device models. I should have mentioned it in the commit message to avoid
confusion, sorry.

It is still an internal implementation detail; I'm not sure what you mean.
Could you elaborate, please?

"use_icount" is redundant now and should be omitted, good point.

This patch is supposed to fix the ordering of IRQ set -> timer expire/counter
wraparound. I'm wondering whether we have the same ordering issue with the
scheduled ptimer trigger callback. I can imagine the following scenario:

1) ptimer_tick() -> the periodic counter is reloaded, or set to "0" in oneshot
mode, and ptimer_trigger() schedules the trigger callback

2) the device uses ptimer_get_count() while the scheduled callback is still
pending execution.

That would have to happen in the same QEMU "cycle" to cause a potential issue.
Is this possible? I'm not familiar with how the AIO scheduler works.

A potential fix could be to call the trigger callback directly from
ptimer_tick() after changing the delta value.
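
The ordering concern in steps 1) and 2) can be made concrete with a toy model.
This is only a sketch: toy_timer, toy_tick(), toy_run_pending() and
toy_tick_direct() are hypothetical names, not QEMU functions, and the model
says nothing about whether QEMU's scheduler actually allows a counter read
between the tick and the callback.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in QEMU. */
struct toy_timer {
    uint64_t limit;        /* reload value */
    uint64_t delta;        /* current counter */
    bool callback_pending; /* trigger callback scheduled, not yet run */
    bool irq_raised;       /* set when the trigger callback runs */
};

/* Deferred variant: reload the counter now, raise the IRQ later. */
static void toy_tick(struct toy_timer *t)
{
    t->delta = t->limit;
    t->callback_pending = true;
}

/* Runs the scheduled callback, if there is one. */
static void toy_run_pending(struct toy_timer *t)
{
    if (t->callback_pending) {
        t->callback_pending = false;
        t->irq_raised = true;
    }
}

/* Direct variant: reload the counter and raise the IRQ in one step. */
static void toy_tick_direct(struct toy_timer *t)
{
    t->delta = t->limit;
    t->irq_raised = true;
}
```

In the deferred variant, a read placed between toy_tick() and
toy_run_pending() sees the reloaded counter while irq_raised is still false.
In the direct variant no such window exists.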
Peter Maydell June 30, 2016, 3:02 p.m. UTC | #6
On 27 June 2016 at 19:26, Dmitry Osipenko <digetx@gmail.com> wrote:
> On 27.06.2016 16:27, Peter Maydell wrote:
>> I guess this fixes a regression, but it looks really weird.
>> Why should the timer behaviour change if there happens to be
>> a bottom half present? That should be an internal implementation
>> detail. It's also a bit odd that use_icount is in the check:
>> that shouldn't generally affect device emulation behaviour...
>
> In the case of a polled timer that has no ptimer trigger bottom-half callback
> set up, we are free to wrap the counter around, since the timer behaviour isn't
> changed from the ptimer user's perspective: the user cannot change the timer's
> state in a handler.
>
> I just decided to keep that wraparound feature for the case of a polled
> free-running timer; this should result in a better distribution of the polled
> values. The potential users of that feature are the "imx_epit" and "digic" timer
> device models. I should have mentioned it in the commit message to avoid
> confusion, sorry.
>
> It is still an internal implementation detail; I'm not sure what you mean.
> Could you elaborate, please?

What I meant was: ptimer_get_count() is typically called to generate
a value to return from a register. That's a separate thing, conceptually,
from whether the device happens to also trigger an interrupt on timer
expiry by passing a bh to ptimer_init(). So it's very odd for a detail
of interrupt-on-timer-expiry (that there is a bottom half) to affect
the value returned when you read the timer count register.

thanks
-- PMM
Dmitry Osipenko June 30, 2016, 7:01 p.m. UTC | #7
On 30.06.2016 18:02, Peter Maydell wrote:
> On 27 June 2016 at 19:26, Dmitry Osipenko <digetx@gmail.com> wrote:
>> On 27.06.2016 16:27, Peter Maydell wrote:
>>> I guess this fixes a regression, but it looks really weird.
>>> Why should the timer behaviour change if there happens to be
>>> a bottom half present? That should be an internal implementation
>>> detail. It's also a bit odd that use_icount is in the check:
>>> that shouldn't generally affect device emulation behaviour...
>>
>> In the case of a polled timer that has no ptimer trigger bottom-half callback
>> set up, we are free to wrap the counter around, since the timer behaviour isn't
>> changed from the ptimer user's perspective: the user cannot change the timer's
>> state in a handler.
>>
>> I just decided to keep that wraparound feature for the case of a polled
>> free-running timer; this should result in a better distribution of the polled
>> values. The potential users of that feature are the "imx_epit" and "digic" timer
>> device models. I should have mentioned it in the commit message to avoid
>> confusion, sorry.
>>
>> It is still an internal implementation detail; I'm not sure what you mean.
>> Could you elaborate, please?
> 
> What I meant was: ptimer_get_count() is typically called to generate
> a value to return from a register. That's a separate thing, conceptually,
> from whether the device happens to also trigger an interrupt on timer
> expiry by passing a bh to ptimer_init(). So it's very odd for a detail
> of interrupt-on-timer-expiry (that there is a bottom half) to affect
> the value returned when you read the timer count register.
> 

In order to handle wraparound correctly, software needs to track the moment of
the wraparound - the interrupt. If software reads a wrapped-around counter value
before the IRQ is triggered (ptimer expired), it will assume that no wraparound
happened and won't correct the counter value, resulting in the periodic
counter "jumping" backwards.

Anything wrong with it? Am I missing something?
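
The backwards jump described above can be illustrated with a minimal sketch of
guest-side timekeeping. It assumes a down-counting periodic timer whose
interrupt handler bumps a period counter; guest_elapsed_ticks() and its
parameters are hypothetical and not taken from any real guest.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guest-side timekeeping for a down-counting periodic timer.
   irq_count is incremented by the guest's timer interrupt handler. */
static uint64_t guest_elapsed_ticks(uint64_t irq_count, uint64_t limit,
                                    uint64_t counter)
{
    assert(counter <= limit);
    /* Full periods seen so far plus progress within the current period. */
    return irq_count * limit + (limit - counter);
}
```

With a limit of 100, a read of 5 just before expiry gives 95 elapsed ticks. If
the next read already returns the wrapped value 98 while irq_count is still 0,
the result is 2, so time appears to run backwards. Once the interrupt has been
handled (irq_count of 1), the same read gives 102.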
Peter Maydell July 1, 2016, 4:36 p.m. UTC | #8
On 30 June 2016 at 20:01, Dmitry Osipenko <digetx@gmail.com> wrote:
> On 30.06.2016 18:02, Peter Maydell wrote:
>> What I meant was: ptimer_get_count() is typically called to generate
>> a value to return from a register. That's a separate thing, conceptually,
>> from whether the device happens to also trigger an interrupt on timer
>> expiry by passing a bh to ptimer_init(). So it's very odd for a detail
>> of interrupt-on-timer-expiry (that there is a bottom half) to affect
>> the value returned when you read the timer count register.

> In order to handle wraparound correctly, software needs to track the moment of
> the wraparound - the interrupt. If software reads a wrapped-around counter value
> before the IRQ is triggered (ptimer expired), it will assume that no wraparound
> happened and won't correct the counter value, resulting in the periodic
> counter "jumping" backwards.

That just says you need particular behaviour between counter reads
and IRQ triggers; it doesn't say that you need the behaviour to be
different if the ptimer code doesn't know about the IRQ trigger.

thanks
-- PMM
Dmitry Osipenko July 1, 2016, 5:49 p.m. UTC | #9
On 01.07.2016 19:36, Peter Maydell wrote:
> On 30 June 2016 at 20:01, Dmitry Osipenko <digetx@gmail.com> wrote:
>> On 30.06.2016 18:02, Peter Maydell wrote:
>>> What I meant was: ptimer_get_count() is typically called to generate
>>> a value to return from a register. That's a separate thing, conceptually,
>>> from whether the device happens to also trigger an interrupt on timer
>>> expiry by passing a bh to ptimer_init(). So it's very odd for a detail
>>> of interrupt-on-timer-expiry (that there is a bottom half) to affect
>>> the value returned when you read the timer count register.
> 
>> In order to handle wraparound correctly, software needs to track the moment of
>> the wraparound - the interrupt. If software reads a wrapped-around counter value
>> before the IRQ is triggered (ptimer expired), it will assume that no wraparound
>> happened and won't correct the counter value, resulting in the periodic
>> counter "jumping" backwards.
> 
> That just says you need particular behaviour between counter reads
> and IRQ triggers; it doesn't say that you need the behaviour to be
> different if the ptimer code doesn't know about the IRQ trigger.
> 

Okay, I already explained the reason for having two different behaviours - to
make the polled counter value more evenly distributed when possible. If I
understand you correctly, you don't like it because it is "odd", and I agree
that it's a bit clumsy.

So, what are we going to do now? Would you just revert the offending commit, or
do you have some other suggestions?

I think we still need to change the returned counter value to "1" in case of an
expired timer, since that would result in deterministic behaviour across all of
the timers. However, it definitely feels like it should go into a standalone
patch, and I can include it in the next iteration of the ptimer patches.
Peter Maydell July 4, 2016, 9:55 a.m. UTC | #10
On 1 July 2016 at 18:49, Dmitry Osipenko <digetx@gmail.com> wrote:
> On 01.07.2016 19:36, Peter Maydell wrote:
>> On 30 June 2016 at 20:01, Dmitry Osipenko <digetx@gmail.com> wrote:
>>> On 30.06.2016 18:02, Peter Maydell wrote:
>>>> What I meant was: ptimer_get_count() is typically called to generate
>>>> a value to return from a register. That's a separate thing, conceptually,
>>>> from whether the device happens to also trigger an interrupt on timer
>>>> expiry by passing a bh to ptimer_init(). So it's very odd for a detail
>>>> of interrupt-on-timer-expiry (that there is a bottom half) to affect
>>>> the value returned when you read the timer count register.
>>
>>> In order to handle wraparound correctly, software needs to track the moment of
>>> the wraparound - the interrupt. If software reads a wrapped-around counter value
>>> before the IRQ is triggered (ptimer expired), it will assume that no wraparound
>>> happened and won't correct the counter value, resulting in the periodic
>>> counter "jumping" backwards.
>>
>> That just says you need particular behaviour between counter reads
>> and IRQ triggers; it doesn't say that you need the behaviour to be
>> different if the ptimer code doesn't know about the IRQ trigger.
>>
>
> Okay, I already explained the reason for having two different behaviours - to
> make the polled counter value more evenly distributed when possible. If I
> understand you correctly, you don't like it because it is "odd", and I agree
> that it's a bit clumsy.

> So, what are we going to do now? Would you just revert the offending commit, or
> do you have some other suggestions?

Well, we need to fix the regression, but basically I'm kind of
confused at the moment. I haven't invested a lot of time in
trying to understand the timer code, so all I can really do
is say "this does not look like the right thing" and ask you
to come up with a different fix for it.

thanks
-- PMM
Peter Maydell July 7, 2016, 10:53 a.m. UTC | #11
On 4 July 2016 at 10:55, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 1 July 2016 at 18:49, Dmitry Osipenko <digetx@gmail.com> wrote:
>> On 01.07.2016 19:36, Peter Maydell wrote:
>>> On 30 June 2016 at 20:01, Dmitry Osipenko <digetx@gmail.com> wrote:
>>>> On 30.06.2016 18:02, Peter Maydell wrote:
>>>>> What I meant was: ptimer_get_count() is typically called to generate
>>>>> a value to return from a register. That's a separate thing, conceptually,
>>>>> from whether the device happens to also trigger an interrupt on timer
>>>>> expiry by passing a bh to ptimer_init(). So it's very odd for a detail
>>>>> of interrupt-on-timer-expiry (that there is a bottom half) to affect
>>>>> the value returned when you read the timer count register.
>>>
>>>> In order to handle wraparound correctly, software needs to track the moment of
>>>> the wraparound - the interrupt. If software reads a wrapped-around counter value
>>>> before the IRQ is triggered (ptimer expired), it will assume that no wraparound
>>>> happened and won't correct the counter value, resulting in the periodic
>>>> counter "jumping" backwards.
>>>
>>> That just says you need particular behaviour between counter reads
>>> and IRQ triggers; it doesn't say that you need the behaviour to be
>>> different if the ptimer code doesn't know about the IRQ trigger.
>>>
>>
>> Okay, I already explained the reason for having two different behaviours - to
>> make the polled counter value more evenly distributed when possible. If I
>> understand you correctly, you don't like it because it is "odd", and I agree
>> that it's a bit clumsy.
>
>> So, what are we going to do now? Would you just revert the offending commit, or
>> do you have some other suggestions?
>
> Well, we need to fix the regression, but basically I'm kind of
> confused at the moment. I haven't invested a lot of time in
> trying to understand the timer code, so all I can really do
> is say "this does not look like the right thing" and ask you
> to come up with a different fix for it.

My current best guess is that this condition should simply be
"if (expired) {".

thanks
-- PMM
Dmitry Osipenko July 7, 2016, 12:20 p.m. UTC | #12
On 07.07.2016 13:53, Peter Maydell wrote:
> On 4 July 2016 at 10:55, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On 1 July 2016 at 18:49, Dmitry Osipenko <digetx@gmail.com> wrote:
>>> On 01.07.2016 19:36, Peter Maydell wrote:
>>>> On 30 June 2016 at 20:01, Dmitry Osipenko <digetx@gmail.com> wrote:
>>>>> On 30.06.2016 18:02, Peter Maydell wrote:
>>>>>> What I meant was: ptimer_get_count() is typically called to generate
>>>>>> a value to return from a register. That's a separate thing, conceptually,
>>>>>> from whether the device happens to also trigger an interrupt on timer
>>>>>> expiry by passing a bh to ptimer_init(). So it's very odd for a detail
>>>>>> of interrupt-on-timer-expiry (that there is a bottom half) to affect
>>>>>> the value returned when you read the timer count register.
>>>>
>>>>> In order to handle wraparound correctly, software needs to track the moment of
>>>>> the wraparound - the interrupt. If software reads a wrapped-around counter value
>>>>> before the IRQ is triggered (ptimer expired), it will assume that no wraparound
>>>>> happened and won't correct the counter value, resulting in the periodic
>>>>> counter "jumping" backwards.
>>>>
>>>> That just says you need particular behaviour between counter reads
>>>> and IRQ triggers; it doesn't say that you need the behaviour to be
>>>> different if the ptimer code doesn't know about the IRQ trigger.
>>>>
>>>
>>> Okay, I already explained the reason for having two different behaviours - to
>>> make the polled counter value more evenly distributed when possible. If I
>>> understand you correctly, you don't like it because it is "odd", and I agree
>>> that it's a bit clumsy.
>>
>>> So, what are we going to do now? Would you just revert the offending commit, or
>>> do you have some other suggestions?
>>
>> Well, we need to fix the regression, but basically I'm kind of
>> confused at the moment. I haven't invested a lot of time in
>> trying to understand the timer code, so all I can really do
>> is say "this does not look like the right thing" and ask you
>> to come up with a different fix for it.
> 

I'm currently leaning towards the revert solution. Unfortunately, I haven't had a
chance to look at it yet; I will do so today and send a patch afterwards.

> My current best guess is that this condition should simply be
> "if (expired) {".
> 

Since you insist that wraparound isn't a needed feature - yes, that's
nearly the same as what the original code did :)

Patch

diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index 05b0c27..8006442 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -93,10 +93,10 @@  uint64_t ptimer_get_count(ptimer_state *s)
         bool oneshot = (s->enabled == 2);
 
         /* Figure out the current counter value.  */
-        if (s->period == 0 || (expired && (oneshot || use_icount))) {
+        if (expired && (oneshot || use_icount || s->bh != NULL)) {
             /* Prevent timer underflowing if it should already have
                triggered.  */
-            counter = 0;
+            counter = 1;
         } else {
             uint64_t rem;
             uint64_t div;
@@ -143,7 +143,9 @@  uint64_t ptimer_get_count(ptimer_state *s)
 
             if (expired && counter != 0) {
                 /* Wrap around periodic counter.  */
-                counter = s->limit - (counter - 1) % s->limit;
+                counter = s->delta = s->limit - (counter - 1) % s->limit;
+                /* Re-arm timer according to the wrapped around value.  */
+                ptimer_reload(s);
             }
         }
     } else {
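
The wraparound expression in the second hunk can be checked in isolation. The
sketch below assumes that, at that point in ptimer_get_count(), "counter" holds
the number of whole ticks elapsed since the timer expired; wrapped_counter()
and the name "overshoot" are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* overshoot: whole ticks elapsed since expiry. It must be non-zero, which
   mirrors the "counter != 0" guard in the patch. */
static uint64_t wrapped_counter(uint64_t limit, uint64_t overshoot)
{
    assert(limit != 0 && overshoot != 0);
    return limit - (overshoot - 1) % limit;
}
```

With a limit of 10, an overshoot of 1 yields 10 (the counter has just been
reloaded), 3 yields 8, 10 yields 1, and 11 yields 10 again, so the result
always stays within 1..limit.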