Message ID | 20220525121030.16054-8-Dragan.Mladjenovic@syrmia.com (mailing list archive) |
---|---|
State | Superseded |
Series | MIPS: Support I6500 multi-cluster configuration |
On 25-May-22 14:10, Dragan Mladjenovic wrote:
> From: Paul Burton <paulburton@kernel.org>
>
> In a multi-cluster MIPS system we have multiple GICs - one in each
> cluster - each of which has its own independent counter. The counters in
> each GIC are not synchronised in any way, so they can drift relative to
> one another through the lifetime of the system. This is problematic for
> a clocksource which ought to be global.
>
> Avoid problems by always accessing cluster 0's counter, using
> cross-cluster register access. This adds overhead so we only do so on
> systems where we actually have CPUs present in multiple clusters.
> For now, be extra conservative and don't use gic counter for vdso or
> sched_clock in this case.
>
> Signed-off-by: Paul Burton <paulburton@kernel.org>
> Signed-off-by: Chao-ying Fu <cfu@wavecomp.com>
> Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com>
>
> diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
> index be4175f415ba..6632d314a2c0 100644
> --- a/drivers/clocksource/mips-gic-timer.c
> +++ b/drivers/clocksource/mips-gic-timer.c
> @@ -170,6 +170,37 @@ static u64 gic_hpt_read(struct clocksource *cs)
>  	return gic_read_count();
>  }
>
> +static u64 gic_hpt_read_multicluster(struct clocksource *cs)
> +{
> +	unsigned int hi, hi2, lo;
> +	u64 count;
> +
> +	mips_cm_lock_other(0, 0, 0, CM_GCR_Cx_OTHER_BLOCK_GLOBAL);
> +
> +	if (mips_cm_is64) {
> +		count = read_gic_redir_counter();
> +		goto out;
> +	}
> +
> +	hi = read_gic_redir_counter_32h();
> +	while (true) {
> +		lo = read_gic_redir_counter_32l();
> +
> +		/* If hi didn't change then lo didn't wrap & we're done */
> +		hi2 = read_gic_redir_counter_32h();
> +		if (hi2 == hi)
> +			break;
> +
> +		/* Otherwise, repeat with the latest hi value */
> +		hi = hi2;
> +	}
> +
> +	count = (((u64)hi) << 32) + lo;
> +out:
> +	mips_cm_unlock_other();
> +	return count;
> +}
> +
>  static struct clocksource gic_clocksource = {
>  	.name			= "GIC",
>  	.read			= gic_hpt_read,
> @@ -204,6 +235,11 @@ static int __init __gic_clocksource_init(void)
>  	/* Calculate a somewhat reasonable rating value. */
>  	gic_clocksource.rating = 200 + gic_frequency / 10000000;
>
> +	if (mips_cps_multicluster_cpus()) {
> +		gic_clocksource.read = &gic_hpt_read_multicluster;
> +		gic_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_NONE;
> +	}
> +
>  	ret = clocksource_register_hz(&gic_clocksource, gic_frequency);
>  	if (ret < 0)
>  		pr_warn("Unable to register clocksource\n");
> @@ -262,7 +298,8 @@ static int __init gic_clocksource_of_init(struct device_node *node)
>  	 * stable CPU frequency or on the platforms with CM3 and CPU frequency
>  	 * change performed by the CPC core clocks divider.
>  	 */
> -	if (mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) {
> +	if ((mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) &&
> +	    !mips_cps_multicluster_cpus()) {
>  		sched_clock_register(mips_cm_is64 ?
>  				     gic_read_count_64 : gic_read_count_2x32,
>  				     64, gic_frequency);

Hi,

I was expecting some comments on this, but I'll ask first. We are now
taking a conservative approach of not using gic as sched_clock in the
multicluster case. Is this necessary or can sched_clock tolerate a
fixed delta between clocks on different cpu clusters?

Best regards,
Dragan
On 2022-06-27 15:17, Dragan Mladjenovic wrote:
> On 25-May-22 14:10, Dragan Mladjenovic wrote:
>> From: Paul Burton <paulburton@kernel.org>
>>
>> In a multi-cluster MIPS system we have multiple GICs - one in each
>> cluster - each of which has its own independent counter. The counters in
>> each GIC are not synchronised in any way, so they can drift relative to
>> one another through the lifetime of the system. This is problematic for
>> a clocksource which ought to be global.
>>
>> Avoid problems by always accessing cluster 0's counter, using
>> cross-cluster register access. This adds overhead so we only do so on
>> systems where we actually have CPUs present in multiple clusters.
>> For now, be extra conservative and don't use gic counter for vdso or
>> sched_clock in this case.
>>
>> Signed-off-by: Paul Burton <paulburton@kernel.org>
>> Signed-off-by: Chao-ying Fu <cfu@wavecomp.com>
>> Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com>
>>
>> diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
>> index be4175f415ba..6632d314a2c0 100644
>> --- a/drivers/clocksource/mips-gic-timer.c
>> +++ b/drivers/clocksource/mips-gic-timer.c
>> @@ -170,6 +170,37 @@ static u64 gic_hpt_read(struct clocksource *cs)
>>  	return gic_read_count();
>>  }
>>
>> +static u64 gic_hpt_read_multicluster(struct clocksource *cs)
>> +{
>> +	unsigned int hi, hi2, lo;
>> +	u64 count;
>> +
>> +	mips_cm_lock_other(0, 0, 0, CM_GCR_Cx_OTHER_BLOCK_GLOBAL);
>> +
>> +	if (mips_cm_is64) {
>> +		count = read_gic_redir_counter();
>> +		goto out;
>> +	}
>> +
>> +	hi = read_gic_redir_counter_32h();
>> +	while (true) {
>> +		lo = read_gic_redir_counter_32l();
>> +
>> +		/* If hi didn't change then lo didn't wrap & we're done */
>> +		hi2 = read_gic_redir_counter_32h();
>> +		if (hi2 == hi)
>> +			break;
>> +
>> +		/* Otherwise, repeat with the latest hi value */
>> +		hi = hi2;
>> +	}
>> +
>> +	count = (((u64)hi) << 32) + lo;
>> +out:
>> +	mips_cm_unlock_other();
>> +	return count;
>> +}
>> +
>>  static struct clocksource gic_clocksource = {
>>  	.name			= "GIC",
>>  	.read			= gic_hpt_read,
>> @@ -204,6 +235,11 @@ static int __init __gic_clocksource_init(void)
>>  	/* Calculate a somewhat reasonable rating value. */
>>  	gic_clocksource.rating = 200 + gic_frequency / 10000000;
>>
>> +	if (mips_cps_multicluster_cpus()) {
>> +		gic_clocksource.read = &gic_hpt_read_multicluster;
>> +		gic_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_NONE;
>> +	}
>> +
>>  	ret = clocksource_register_hz(&gic_clocksource, gic_frequency);
>>  	if (ret < 0)
>>  		pr_warn("Unable to register clocksource\n");
>> @@ -262,7 +298,8 @@ static int __init gic_clocksource_of_init(struct device_node *node)
>>  	 * stable CPU frequency or on the platforms with CM3 and CPU frequency
>>  	 * change performed by the CPC core clocks divider.
>>  	 */
>> -	if (mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) {
>> +	if ((mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) &&
>> +	    !mips_cps_multicluster_cpus()) {
>>  		sched_clock_register(mips_cm_is64 ?
>>  				     gic_read_count_64 : gic_read_count_2x32,
>>  				     64, gic_frequency);
>
> Hi,
>
> I was expecting some comments on this, but I'll ask first. We are now
> taking a conservative approach of not using gic as sched_clock in the
> multicluster case. Is this necessary or can sched_clock tolerate a
> fixed delta between clocks on different cpu clusters?

I don't think that's wise. We generally go to all sorts of trouble to
keep sched_clock() strictly identical between CPUs, and there are tons
of things that rely on this (the scheduler itself, but also any sort of
tracing...). You just have to grep for the various use cases.

A consequence of the above is that the kernel can (and will) snapshot
a sched_clock value, and compare it to the value on the current CPU.
Imagine what happens if the difference is negative...

So I don't know what the deal is with the MIPS GIC, but if any of the
above can happen, you're doomed.

M.
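To make the "negative difference" concern concrete, here is a minimal user-space sketch. It is not kernel code: elapsed_ticks() and elapsed_ticks_signed() are hypothetical helpers invented for illustration, and the tick values are made up. It assumes a timestamp was snapshotted on a CPU whose counter runs ahead, then compared on a CPU whose counter is behind.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): elapsed ticks between a saved
 * snapshot and the current reading, computed the usual unsigned way.
 * If the snapshot came from a counter that is ahead of the one used for
 * "now", the subtraction wraps around to a huge value.
 */
static uint64_t elapsed_ticks(uint64_t now, uint64_t snapshot)
{
	return now - snapshot;
}

/*
 * The same delta viewed as signed, as code that checks for time going
 * backwards might do. The cast assumes a two's complement target, which
 * holds for mainstream compilers.
 */
static int64_t elapsed_ticks_signed(uint64_t now, uint64_t snapshot)
{
	return (int64_t)(now - snapshot);
}
```

With a snapshot of 1500 ticks taken on the "ahead" counter and a current reading of 1000 ticks on the "behind" counter, the unsigned delta is 2^64 - 500 and the signed delta is -500. Either result would mislead code that expects time to move forward.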
diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
index be4175f415ba..6632d314a2c0 100644
--- a/drivers/clocksource/mips-gic-timer.c
+++ b/drivers/clocksource/mips-gic-timer.c
@@ -170,6 +170,37 @@ static u64 gic_hpt_read(struct clocksource *cs)
 	return gic_read_count();
 }
 
+static u64 gic_hpt_read_multicluster(struct clocksource *cs)
+{
+	unsigned int hi, hi2, lo;
+	u64 count;
+
+	mips_cm_lock_other(0, 0, 0, CM_GCR_Cx_OTHER_BLOCK_GLOBAL);
+
+	if (mips_cm_is64) {
+		count = read_gic_redir_counter();
+		goto out;
+	}
+
+	hi = read_gic_redir_counter_32h();
+	while (true) {
+		lo = read_gic_redir_counter_32l();
+
+		/* If hi didn't change then lo didn't wrap & we're done */
+		hi2 = read_gic_redir_counter_32h();
+		if (hi2 == hi)
+			break;
+
+		/* Otherwise, repeat with the latest hi value */
+		hi = hi2;
+	}
+
+	count = (((u64)hi) << 32) + lo;
+out:
+	mips_cm_unlock_other();
+	return count;
+}
+
 static struct clocksource gic_clocksource = {
 	.name			= "GIC",
 	.read			= gic_hpt_read,
@@ -204,6 +235,11 @@ static int __init __gic_clocksource_init(void)
 	/* Calculate a somewhat reasonable rating value. */
 	gic_clocksource.rating = 200 + gic_frequency / 10000000;
 
+	if (mips_cps_multicluster_cpus()) {
+		gic_clocksource.read = &gic_hpt_read_multicluster;
+		gic_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_NONE;
+	}
+
 	ret = clocksource_register_hz(&gic_clocksource, gic_frequency);
 	if (ret < 0)
 		pr_warn("Unable to register clocksource\n");
@@ -262,7 +298,8 @@ static int __init gic_clocksource_of_init(struct device_node *node)
 	 * stable CPU frequency or on the platforms with CM3 and CPU frequency
 	 * change performed by the CPC core clocks divider.
 	 */
-	if (mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) {
+	if ((mips_cm_revision() >= CM_REV_CM3 || !IS_ENABLED(CONFIG_CPU_FREQ)) &&
+	    !mips_cps_multicluster_cpus()) {
 		sched_clock_register(mips_cm_is64 ?
 				     gic_read_count_64 : gic_read_count_2x32,
 				     64, gic_frequency);
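
The 32-bit path of gic_hpt_read_multicluster() reads the high word, then the low word, then the high word again, and retries if the high word changed in between. The user-space sketch below illustrates that pattern against a simulated counter. Everything in it is an assumption made for illustration: struct fake_counter, read_hi(), read_lo() and read_counter_2x32() are hypothetical names, and the fake counter advances by a fixed step on every register access to stand in for hardware ticking between reads.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit counter exposed as two 32-bit registers. */
struct fake_counter {
	uint64_t value;	/* current count */
	uint64_t step;	/* ticks that elapse on every register access */
};

/* Read the upper 32 bits, then let simulated time advance. */
static uint32_t read_hi(struct fake_counter *c)
{
	uint32_t hi = (uint32_t)(c->value >> 32);

	c->value += c->step;
	return hi;
}

/* Read the lower 32 bits, then let simulated time advance. */
static uint32_t read_lo(struct fake_counter *c)
{
	uint32_t lo = (uint32_t)c->value;

	c->value += c->step;
	return lo;
}

/* hi/lo/hi read: retry until the high word is the same before and after lo. */
static uint64_t read_counter_2x32(struct fake_counter *c)
{
	uint32_t hi, hi2, lo;

	hi = read_hi(c);
	for (;;) {
		lo = read_lo(c);

		/* If hi did not change, lo did not wrap between the reads. */
		hi2 = read_hi(c);
		if (hi2 == hi)
			break;

		/* Otherwise retry with the latest hi value. */
		hi = hi2;
	}

	return ((uint64_t)hi << 32) | lo;
}
```

Starting the fake counter at 0xFFFFFFFF with a step of 1, a single hi-then-lo read would pair hi = 0 with lo = 0 and return 0, which is behind the starting value. The retry loop instead returns 0x100000002, a value the counter really held during the read.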