Message ID | 238e837b8d4e17925801c4e85de17bdfca4ddd00.1598042152.git.anchalag@amazon.com (mailing list archive) |
---|---|
State | New, archived |
Series | Fix PM hibernation in Xen guests |
On 8/21/20 6:30 PM, Anchal Agarwal wrote:
> Save/restore xen_sched_clock_offset in syscore suspend/resume during PM
> hibernation. Commit '867cefb4cb1012: ("xen: Fix x86 sched_clock() interface
> for xen")' fixes xen guest time handling during migration. A similar issue
> is seen during PM hibernation when system runs CPU intensive workload.
> Post resume pvclock resets the value to 0 however, xen sched_clock_offset
> is never updated. System instability is seen during resume from hibernation
> when system is under heavy CPU load. Since xen_sched_clock_offset is not
> updated, system does not see the monotonic clock value and the scheduler
> would then think that heavy CPU hog tasks need more time in CPU, causing
> the system to freeze

I don't think you need to explain why non-monotonic clocks are bad. (and, in fact, the same applies to commit message in patch 8)

>
> Signed-off-by: Anchal Agarwal <anchalag@amazon.com>
> ---
>  arch/x86/xen/suspend.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
> index b12db6966af6..a62e08a11681 100644
> --- a/arch/x86/xen/suspend.c
> +++ b/arch/x86/xen/suspend.c
> @@ -98,8 +98,9 @@ static int xen_syscore_suspend(void)
>  		return 0;
>
>  	gnttab_suspend();
> -
>  	xen_manage_runstate_time(-1);
> +	xen_save_sched_clock_offset();
> +
>  	xrfp.domid = DOMID_SELF;
>  	xrfp.gpfn = __pa(HYPERVISOR_shared_info) >> PAGE_SHIFT;
>
> @@ -120,6 +121,12 @@ static void xen_syscore_resume(void)
>  	xen_hvm_map_shared_info();
>
>  	pvclock_resume();
> +
> +	/*
> +	 * Restore xen_sched_clock_offset during resume to maintain
> +	 * monotonic clock value
> +	 */

I'd drop this comment, we know what the call does.

-boris

> +	xen_restore_sched_clock_offset();
>  	xen_manage_runstate_time(0);
>  	gnttab_resume();
>  }
Save/restore xen_sched_clock_offset in syscore suspend/resume during PM
hibernation. Commit '867cefb4cb1012: ("xen: Fix x86 sched_clock() interface
for xen")' fixes xen guest time handling during migration. A similar issue
is seen during PM hibernation when system runs CPU intensive workload.
Post resume pvclock resets the value to 0 however, xen sched_clock_offset
is never updated. System instability is seen during resume from hibernation
when system is under heavy CPU load. Since xen_sched_clock_offset is not
updated, system does not see the monotonic clock value and the scheduler
would then think that heavy CPU hog tasks need more time in CPU, causing
the system to freeze

Signed-off-by: Anchal Agarwal <anchalag@amazon.com>
---
 arch/x86/xen/suspend.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
index b12db6966af6..a62e08a11681 100644
--- a/arch/x86/xen/suspend.c
+++ b/arch/x86/xen/suspend.c
@@ -98,8 +98,9 @@ static int xen_syscore_suspend(void)
 		return 0;

 	gnttab_suspend();
-
 	xen_manage_runstate_time(-1);
+	xen_save_sched_clock_offset();
+
 	xrfp.domid = DOMID_SELF;
 	xrfp.gpfn = __pa(HYPERVISOR_shared_info) >> PAGE_SHIFT;

@@ -120,6 +121,12 @@ static void xen_syscore_resume(void)
 	xen_hvm_map_shared_info();

 	pvclock_resume();
+
+	/*
+	 * Restore xen_sched_clock_offset during resume to maintain
+	 * monotonic clock value
+	 */
+	xen_restore_sched_clock_offset();
 	xen_manage_runstate_time(0);
 	gnttab_resume();
 }
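
For readers who want to see why saving and restoring the offset keeps the scheduler clock monotonic, here is a minimal userspace sketch. It is not the kernel implementation: `raw_clock_ns`, `sketch_sched_clock()`, `sketch_save_offset()` and `sketch_restore_offset()` are hypothetical stand-ins, and the only assumption taken from the commit message is that the raw pvclock reading restarts from 0 after resume.

```c
#include <stdint.h>

/* Hypothetical stand-in for the raw pvclock reading (not a kernel API). */
static uint64_t raw_clock_ns;

/* Offset subtracted from the raw reading to produce the scheduler clock. */
static uint64_t sched_clock_offset;

/* Scheduler clock value captured just before suspend. */
static uint64_t saved_sched_clock;

/* Scheduler clock as seen by the rest of the system. */
static uint64_t sketch_sched_clock(void)
{
	return raw_clock_ns - sched_clock_offset;
}

/* Suspend path: remember the current scheduler clock value. */
static void sketch_save_offset(void)
{
	saved_sched_clock = sketch_sched_clock();
}

/*
 * Resume path: recompute the offset so that the scheduler clock
 * continues from the saved value. Unsigned wraparound is intentional:
 * the raw reading may now be smaller than the saved value.
 */
static void sketch_restore_offset(void)
{
	sched_clock_offset = raw_clock_ns - saved_sched_clock;
}
```

In this sketch, if the raw reading drops from 5000 to 0 across a hibernation cycle and the restore step is skipped, `sketch_sched_clock()` jumps backwards from 5000 to 0. With the save before suspend and the restore after resume, it continues from 5000.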