Message ID | 1470403241-5428-1-git-send-email-pmladek@suse.com (mailing list archive) |
---|---|
State | Accepted, archived |
Delegated to: | Zhang Rui |
Headers | show |
On Fri, 5 Aug 2016 15:20:41 +0200 Petr Mladek <pmladek@suse.com> wrote: > I have got a zero division error when disabling the forced > idle injection from the intel powerclamp. I did > > echo 0 >/sys/class/thermal/cooling_device48/cur_state > > and got > > [ 986.072632] divide error: 0000 [#1] PREEMPT SMP > [ 986.078989] Modules linked in: > [ 986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted > 4.7.0-1-default+ #3055 [ 986.093781] Hardware name: Intel > Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 > 05/15/2013 [ 986.106227] task: ffff880430e1c080 task.stack: > ffff880427ef0000 [ 986.114122] RIP: 0010:[<ffffffff81794859>] > [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [ 986.124609] RSP: > 0018:ffff880427ef3e20 EFLAGS: 00010246 [ 986.131860] RAX: > 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001 > [ 986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: > 0000000000000018 [ 986.150478] RBP: ffff880427ef3ec8 R08: > ffff880427ef0000 R09: 0000000000000002 [ 986.159779] R10: > 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002 > [ 986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15: > ffff880427ef0000 [ 986.178388] FS: 0000000000000000(0000) > GS:ffff880435940000(0000) knlGS:0000000000000000 [ 986.188785] CS: > 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 986.196559] CR2: > 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0 > [ 986.205909] Stack: [ 986.209524] ffff8802be897b00 > ffff880430e1c080 0000000000000011 0000006a35959780 [ 986.219236] > 0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0 > [ 986.228966] 0000000100029d93 ffffffff81794140 0000000000000000 > ffffffff05000011 [ 986.238686] Call Trace: [ 986.242825] > [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80 [ 986.250866] > [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180 > [ 986.259797] [<ffffffff8111d1a9>] kthread+0xc9/0xe0 > [ 986.266682] [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40 > [ 986.274142] [<ffffffff8111d0e0>] ? > kthread_create_on_node+0x180/0x180 [ 986.282869] Code: d1 ea 48 89 > d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42 > 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e > 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8 [ 986.307806] RIP > [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [ 986.315871] RSP > <ffff880427ef3e20> > > RIP points to the following lines: > > compensation = get_compensation(target_ratio); > interval = duration_jiffies*100/(target_ratio+compensation); > > A solution would be to switch the following two commands in > powerclamp_set_cur_state(): > > set_target_ratio = 0; > end_power_clamp(); > > But I think that the zero division might happen also when target_ratio > is non-zero because the compensation might be negative. Therefore > we also check the sum of target_ratio and compensation explicitly. > > Also the compensated_ratio variable is always set. Therefore there > is no need to initialize it. > > Signed-off-by: Petr Mladek <pmladek@suse.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Thanks > --- > Changes against v1: > > + Also set_target_ratio to 0 after the threads are stopped. > > drivers/thermal/intel_powerclamp.c | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/drivers/thermal/intel_powerclamp.c > b/drivers/thermal/intel_powerclamp.c index 015ce2eb6eb7..0e4dc0afcfd2 > 100644 --- a/drivers/thermal/intel_powerclamp.c > +++ b/drivers/thermal/intel_powerclamp.c > @@ -388,7 +388,7 @@ static int clamp_thread(void *arg) > int sleeptime; > unsigned long target_jiffies; > unsigned int guard; > - unsigned int compensation = 0; > + unsigned int compensated_ratio; > int interval; /* jiffies to sleep for each attempt */ > unsigned int duration_jiffies = > msecs_to_jiffies(duration); unsigned int window_size_now; > @@ -409,8 +409,11 @@ static int clamp_thread(void *arg) > * c-states, thus we need to compensate the injected > idle ratio > * to achieve the actual target reported by the HW. > */ > - compensation = get_compensation(target_ratio); > - interval = > duration_jiffies*100/(target_ratio+compensation); > + compensated_ratio = target_ratio + > + get_compensation(target_ratio); > + if (compensated_ratio <= 0) > + compensated_ratio = 1; > + interval = duration_jiffies * 100 / > compensated_ratio; > /* align idle time */ > target_jiffies = roundup(jiffies, interval); > @@ -647,8 +650,8 @@ static int powerclamp_set_cur_state(struct > thermal_cooling_device *cdev, goto exit_set; > } else if (set_target_ratio > 0 && new_target_ratio > == 0) { pr_info("Stop forced idle injection\n"); > - set_target_ratio = 0; > end_power_clamp(); > + set_target_ratio = 0; > } else /* adjust currently running */ { > set_target_ratio = new_target_ratio; > /* make new set_target_ratio visible to other cpus */ [Jacob Pan] -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c index 015ce2eb6eb7..0e4dc0afcfd2 100644 --- a/drivers/thermal/intel_powerclamp.c +++ b/drivers/thermal/intel_powerclamp.c @@ -388,7 +388,7 @@ static int clamp_thread(void *arg) int sleeptime; unsigned long target_jiffies; unsigned int guard; - unsigned int compensation = 0; + unsigned int compensated_ratio; int interval; /* jiffies to sleep for each attempt */ unsigned int duration_jiffies = msecs_to_jiffies(duration); unsigned int window_size_now; @@ -409,8 +409,11 @@ static int clamp_thread(void *arg) * c-states, thus we need to compensate the injected idle ratio * to achieve the actual target reported by the HW. */ - compensation = get_compensation(target_ratio); - interval = duration_jiffies*100/(target_ratio+compensation); + compensated_ratio = target_ratio + + get_compensation(target_ratio); + if (compensated_ratio <= 0) + compensated_ratio = 1; + interval = duration_jiffies * 100 / compensated_ratio; /* align idle time */ target_jiffies = roundup(jiffies, interval); @@ -647,8 +650,8 @@ static int powerclamp_set_cur_state(struct thermal_cooling_device *cdev, goto exit_set; } else if (set_target_ratio > 0 && new_target_ratio == 0) { pr_info("Stop forced idle injection\n"); - set_target_ratio = 0; end_power_clamp(); + set_target_ratio = 0; } else /* adjust currently running */ { set_target_ratio = new_target_ratio; /* make new set_target_ratio visible to other cpus */
I have got a zero division error when disabling the forced idle injection from the intel powerclamp. I did echo 0 >/sys/class/thermal/cooling_device48/cur_state and got [ 986.072632] divide error: 0000 [#1] PREEMPT SMP [ 986.078989] Modules linked in: [ 986.083618] CPU: 17 PID: 24967 Comm: kidle_inject/17 Not tainted 4.7.0-1-default+ #3055 [ 986.093781] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013 [ 986.106227] task: ffff880430e1c080 task.stack: ffff880427ef0000 [ 986.114122] RIP: 0010:[<ffffffff81794859>] [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [ 986.124609] RSP: 0018:ffff880427ef3e20 EFLAGS: 00010246 [ 986.131860] RAX: 0000000000000258 RBX: 0000000000000006 RCX: 0000000000000001 [ 986.141179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018 [ 986.150478] RBP: ffff880427ef3ec8 R08: ffff880427ef0000 R09: 0000000000000002 [ 986.159779] R10: 0000000000003df2 R11: 0000000000000018 R12: 0000000000000002 [ 986.169089] R13: 0000000000000000 R14: ffff880427ef0000 R15: ffff880427ef0000 [ 986.178388] FS: 0000000000000000(0000) GS:ffff880435940000(0000) knlGS:0000000000000000 [ 986.188785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 986.196559] CR2: 00007f1d0caf0000 CR3: 0000000002006000 CR4: 00000000001406e0 [ 986.205909] Stack: [ 986.209524] ffff8802be897b00 ffff880430e1c080 0000000000000011 0000006a35959780 [ 986.219236] 0000000000000011 ffff880427ef0008 0000000000000000 ffff8804359503d0 [ 986.228966] 0000000100029d93 ffffffff81794140 0000000000000000 ffffffff05000011 [ 986.238686] Call Trace: [ 986.242825] [<ffffffff81794140>] ? pkg_state_counter+0x80/0x80 [ 986.250866] [<ffffffff81794680>] ? powerclamp_set_cur_state+0x180/0x180 [ 986.259797] [<ffffffff8111d1a9>] kthread+0xc9/0xe0 [ 986.266682] [<ffffffff8193d69f>] ret_from_fork+0x1f/0x40 [ 986.274142] [<ffffffff8111d0e0>] ? kthread_create_on_node+0x180/0x180 [ 986.282869] Code: d1 ea 48 89 d6 80 3d 6a d0 d4 00 00 ba 64 00 00 00 89 d8 41 0f 45 f5 0f af c2 42 8d 14 2e be 31 00 00 00 83 fa 31 0f 42 f2 31 d2 <f7> f6 48 8b 15 9e 07 87 00 48 8b 3d 97 07 87 00 48 63 f0 83 e8 [ 986.307806] RIP [<ffffffff81794859>] clamp_thread+0x1d9/0x600 [ 986.315871] RSP <ffff880427ef3e20> RIP points to the following lines: compensation = get_compensation(target_ratio); interval = duration_jiffies*100/(target_ratio+compensation); A solution would be to switch the following two commands in powerclamp_set_cur_state(): set_target_ratio = 0; end_power_clamp(); But I think that the zero division might happen also when target_ratio is non-zero because the compensation might be negative. Therefore we also check the sum of target_ratio and compensation explicitly. Also the compensated_ratio variable is always set. Therefore there is no need to initialize it. Signed-off-by: Petr Mladek <pmladek@suse.com> --- Changes against v1: + Also set_target_ratio to 0 after the threads are stopped. drivers/thermal/intel_powerclamp.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)