
[2/9] softirq: Use sched_clock() based timeout

Message ID 20230505113315.3307723-3-liujian56@huawei.com (mailing list archive)
State Not Applicable
Series fix softlockup in run_timer_softirq

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 10 this patch: 10
netdev/cc_maintainers warning 1 maintainers not CCed: clingutla@codeaurora.org
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 10 this patch: 10
netdev/checkpatch warning CHECK: From:/Signed-off-by: email comments mismatch: 'From: Peter Zijlstra <peterz@infradead.org>' != 'Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>' WARNING: line length of 82 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

liujian (CE) May 5, 2023, 11:33 a.m. UTC
From: Peter Zijlstra <peterz@infradead.org>

Replace the jiffies based timeout with a sched_clock() based one.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
 kernel/softirq.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Jason Xing May 8, 2023, 4:08 a.m. UTC | #1
On Fri, May 5, 2023 at 7:25 PM Liu Jian <liujian56@huawei.com> wrote:
>
> From: Peter Zijlstra <peterz@infradead.org>
>
> Replace the jiffies based timeout with a sched_clock() based one.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
>  kernel/softirq.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index bff5debf6ce6..59f16a9af5d1 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -27,6 +27,7 @@
>  #include <linux/tick.h>
>  #include <linux/irq.h>
>  #include <linux/wait_bit.h>
> +#include <linux/sched/clock.h>
>
>  #include <asm/softirq_stack.h>
>
> @@ -489,7 +490,7 @@ asmlinkage __visible void do_softirq(void)
>   * we want to handle softirqs as soon as possible, but they
>   * should not be able to lock up the box.
>   */
> -#define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
> +#define MAX_SOFTIRQ_TIME       (2 * NSEC_PER_MSEC)

I wonder if this affects servers that set HZ to a value other than
the default of 1000.

Thanks,
Jason

>  #define MAX_SOFTIRQ_RESTART 10
>
>  #ifdef CONFIG_TRACE_IRQFLAGS
> @@ -527,9 +528,9 @@ static inline void lockdep_softirq_end(bool in_hardirq) { }
>
>  asmlinkage __visible void __softirq_entry __do_softirq(void)
>  {
> -       unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
>         unsigned long old_flags = current->flags;
>         int max_restart = MAX_SOFTIRQ_RESTART;
> +       u64 start = sched_clock();
>         struct softirq_action *h;
>         unsigned long pending;
>         unsigned int vec_nr;
> @@ -584,7 +585,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
>
>         pending = local_softirq_pending();
>         if (pending) {
> -               if (time_before(jiffies, end) && !need_resched() &&
> +               if (sched_clock() - start < MAX_SOFTIRQ_TIME && !need_resched() &&
>                     --max_restart)
>                         goto restart;
>
> --
> 2.34.1
>
>
Thomas Gleixner May 8, 2023, 5:51 p.m. UTC | #2
On Mon, May 08 2023 at 12:08, Jason Xing wrote:
> On Fri, May 5, 2023 at 7:25 PM Liu Jian <liujian56@huawei.com> wrote:
>> @@ -489,7 +490,7 @@ asmlinkage __visible void do_softirq(void)
>>   * we want to handle softirqs as soon as possible, but they
>>   * should not be able to lock up the box.
>>   */
>> -#define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
>> +#define MAX_SOFTIRQ_TIME       (2 * NSEC_PER_MSEC)
>
> I wonder if this affects servers that set HZ to a value other than
> the default of 1000.

The result of msecs_to_jiffies(2) for different HZ values:

HZ=100     1
HZ=250     1
HZ=1000    2
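
That rounding, as a minimal sketch (simplified; the real
msecs_to_jiffies() also handles HZ values that do not divide 1000
evenly):

	/* Simplified sketch, assumes 1000 % HZ == 0 */
	static unsigned long msecs_to_jiffies_sketch(unsigned int ms, unsigned int hz)
	{
		unsigned int ms_per_tick = 1000 / hz;

		/* round up to full ticks */
		return (ms + ms_per_tick - 1) / ms_per_tick;
	}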

So depending on when the softirq processing starts, this gives the
following ranges in which the timeout ends:

HZ=100    0 - 10ms
HZ=250    0 -  4ms
HZ=1000   1 -  2ms
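
Those windows follow from the fact that the first tick can fire anywhere
from ~0 to one full tick period after jiffies is sampled, while any
remaining ticks are full periods. A tiny userspace sketch (again assuming
1000 % HZ == 0) that reproduces the table:

	#include <stdio.h>

	static void budget_window(unsigned int hz, unsigned long n_jiffies)
	{
		unsigned int tick_ms = 1000 / hz;

		/* first tick: 0..tick_ms away; the rest are full periods */
		printf("HZ=%-5u %lu - %lu ms\n", hz,
		       (n_jiffies - 1) * tick_ms, n_jiffies * tick_ms);
	}

	int main(void)
	{
		budget_window(100, 1);		/* 0 - 10 ms */
		budget_window(250, 1);		/* 0 -  4 ms */
		budget_window(1000, 2);		/* 1 -  2 ms */
		return 0;
	}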

But as the various softirq handlers have their own notions of timeouts,
loop limits etc., and the timeout is only checked after _all_ pending
bits of an iteration have been processed, the outcome of this is a
lottery.

Because of that, the sched_clock() change per se won't have much impact,
but once the overall changes that consolidate the break conditions are
in place, I think it will have observable effects.

Making this consistent is definitely a good thing, but it won't solve
the underlying problem of soft interrupt processing at all.

We definitely need to spend more thought on pulling things out of soft
interrupt context so that these functionalities come under proper
resource control by the scheduler.

Thanks,

        tglx

Patch

diff --git a/kernel/softirq.c b/kernel/softirq.c
index bff5debf6ce6..59f16a9af5d1 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -27,6 +27,7 @@ 
 #include <linux/tick.h>
 #include <linux/irq.h>
 #include <linux/wait_bit.h>
+#include <linux/sched/clock.h>
 
 #include <asm/softirq_stack.h>
 
@@ -489,7 +490,7 @@  asmlinkage __visible void do_softirq(void)
  * we want to handle softirqs as soon as possible, but they
  * should not be able to lock up the box.
  */
-#define MAX_SOFTIRQ_TIME  msecs_to_jiffies(2)
+#define MAX_SOFTIRQ_TIME	(2 * NSEC_PER_MSEC)
 #define MAX_SOFTIRQ_RESTART 10
 
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -527,9 +528,9 @@  static inline void lockdep_softirq_end(bool in_hardirq) { }
 
 asmlinkage __visible void __softirq_entry __do_softirq(void)
 {
-	unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
 	unsigned long old_flags = current->flags;
 	int max_restart = MAX_SOFTIRQ_RESTART;
+	u64 start = sched_clock();
 	struct softirq_action *h;
 	unsigned long pending;
 	unsigned int vec_nr;
@@ -584,7 +585,7 @@  asmlinkage __visible void __softirq_entry __do_softirq(void)
 
 	pending = local_softirq_pending();
 	if (pending) {
-		if (time_before(jiffies, end) && !need_resched() &&
+		if (sched_clock() - start < MAX_SOFTIRQ_TIME && !need_resched() &&
 		    --max_restart)
 			goto restart;