Message ID | 20200320092428.20880-11-eric.auger@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm/arm64: Add ITS tests | expand |
Hi Eric, On 2020/3/20 17:24, Eric Auger wrote: > Triggers LPIs through the INT command. > > the test checks the LPI hits the right CPU and triggers > the right LPI intid, ie. the translation is correct. > > Updates to the config table also are tested, along with inv > and invall commands. > > Signed-off-by: Eric Auger <eric.auger@redhat.com> [...] So I've tested this series and found that the "INT" test will sometimes fail. "not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3 after migration not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2 after migration" From logs: "INFO: gicv3: its-migration: Migration complete INT dev_id=2 event_id=20 INFO: gicv3: its-migration: No LPI received whereas (cpuid=3, intid=8195) was expected FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3 after migration INT dev_id=7 event_id=255 INFO: gicv3: its-migration: No LPI received whereas (cpuid=2, intid=8196) was expected FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2 after migration" > +static void check_lpi_stats(const char *msg) > +{ > + bool pass = false; > + > + mdelay(100); After changing this to 'mdelay(1000)', the above error doesn't show up anymore. But it sounds strange that 100ms is not enough to deliver a single LPI. I haven't dig it further but will get back here later. > + smp_rmb(); /* pairs with wmb in lpi_handler */ > + if (lpi_stats.observed.cpu_id != lpi_stats.expected.cpu_id || > + lpi_stats.observed.lpi_id != lpi_stats.expected.lpi_id) { > + if (lpi_stats.observed.cpu_id == -1 && > + lpi_stats.observed.lpi_id == -1) { > + report_info("No LPI received whereas (cpuid=%d, intid=%d) " > + "was expected", lpi_stats.expected.cpu_id, > + lpi_stats.expected.lpi_id); > + } else { > + report_info("Unexpected LPI (cpuid=%d, intid=%d)", > + lpi_stats.observed.cpu_id, > + lpi_stats.observed.lpi_id); > + } > + } else { > + pass = true; > + } > + report(pass, "%s", msg); > +} This patch itself looks good to me, Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Thanks
Hi Zenghui, On 3/30/20 12:43 PM, Zenghui Yu wrote: > Hi Eric, > > On 2020/3/20 17:24, Eric Auger wrote: >> Triggers LPIs through the INT command. >> >> the test checks the LPI hits the right CPU and triggers >> the right LPI intid, ie. the translation is correct. >> >> Updates to the config table also are tested, along with inv >> and invall commands. >> >> Signed-off-by: Eric Auger <eric.auger@redhat.com> > > [...] > > So I've tested this series and found that the "INT" test will sometimes > fail. > > "not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en > PE #3 after migration > not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on > PE #2 after migration" > > From logs: > "INFO: gicv3: its-migration: Migration complete > INT dev_id=2 event_id=20 > INFO: gicv3: its-migration: No LPI received whereas (cpuid=3, > intid=8195) was expected > FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3 > after migration > INT dev_id=7 event_id=255 > INFO: gicv3: its-migration: No LPI received whereas (cpuid=2, > intid=8196) was expected > FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2 > after migration" > >> +static void check_lpi_stats(const char *msg) >> +{ >> + bool pass = false; >> + >> + mdelay(100); > > After changing this to 'mdelay(1000)', the above error doesn't show up > anymore. But it sounds strange that 100ms is not enough to deliver a > single LPI. I haven't dig it further but will get back here later. Did you find some time to investigate this issue. Changing 100 to 1000 has a huge impact on the overall test duration and I don't think it is sensible. Could you see what is your minimal value that pass the tests? Thanks Eric > >> + smp_rmb(); /* pairs with wmb in lpi_handler */ >> + if (lpi_stats.observed.cpu_id != lpi_stats.expected.cpu_id || >> + lpi_stats.observed.lpi_id != lpi_stats.expected.lpi_id) { >> + if (lpi_stats.observed.cpu_id == -1 && >> + lpi_stats.observed.lpi_id == -1) { >> + report_info("No LPI received whereas (cpuid=%d, intid=%d) " >> + "was expected", lpi_stats.expected.cpu_id, >> + lpi_stats.expected.lpi_id); >> + } else { >> + report_info("Unexpected LPI (cpuid=%d, intid=%d)", >> + lpi_stats.observed.cpu_id, >> + lpi_stats.observed.lpi_id); >> + } >> + } else { >> + pass = true; >> + } >> + report(pass, "%s", msg); >> +} > > This patch itself looks good to me, > Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> > > > Thanks >
Hi Eric, On 2020/4/2 16:50, Auger Eric wrote: > Hi Zenghui, > > On 3/30/20 12:43 PM, Zenghui Yu wrote: >> Hi Eric, >> >> On 2020/3/20 17:24, Eric Auger wrote: >>> Triggers LPIs through the INT command. >>> >>> the test checks the LPI hits the right CPU and triggers >>> the right LPI intid, ie. the translation is correct. >>> >>> Updates to the config table also are tested, along with inv >>> and invall commands. >>> >>> Signed-off-by: Eric Auger <eric.auger@redhat.com> >> >> [...] >> >> So I've tested this series and found that the "INT" test will sometimes >> fail. >> >> "not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en >> PE #3 after migration >> not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on >> PE #2 after migration" >> >> From logs: >> "INFO: gicv3: its-migration: Migration complete >> INT dev_id=2 event_id=20 >> INFO: gicv3: its-migration: No LPI received whereas (cpuid=3, >> intid=8195) was expected >> FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3 >> after migration >> INT dev_id=7 event_id=255 >> INFO: gicv3: its-migration: No LPI received whereas (cpuid=2, >> intid=8196) was expected >> FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2 >> after migration" >> >>> +static void check_lpi_stats(const char *msg) >>> +{ >>> + bool pass = false; >>> + >>> + mdelay(100); >> >> After changing this to 'mdelay(1000)', the above error doesn't show up >> anymore. But it sounds strange that 100ms is not enough to deliver a >> single LPI. I haven't dig it further but will get back here later. > > Did you find some time to investigate this issue. Changing 100 to 1000 > has a huge impact on the overall test duration and I don't think it is > sensible. Could you see what is your minimal value that pass the tests? I can reproduce this issue with a very *low* probability so I failed to investigate it :-(. (It might because the LPI was delivered to a busy vcpu...) You can leave it as it is until someone else complain about it again. Or take the similar approach as check_acked() - wait up to 5s for the interrupt to be delivered, and bail out as soon as we see it. Thanks, Zenghui
On Thu, Apr 02, 2020 at 08:40:42PM +0800, Zenghui Yu wrote: > Hi Eric, > > On 2020/4/2 16:50, Auger Eric wrote: > > Hi Zenghui, > > > > On 3/30/20 12:43 PM, Zenghui Yu wrote: > > > Hi Eric, > > > > > > On 2020/3/20 17:24, Eric Auger wrote: > > > > Triggers LPIs through the INT command. > > > > > > > > the test checks the LPI hits the right CPU and triggers > > > > the right LPI intid, ie. the translation is correct. > > > > > > > > Updates to the config table also are tested, along with inv > > > > and invall commands. > > > > > > > > Signed-off-by: Eric Auger <eric.auger@redhat.com> > > > > > > [...] > > > > > > So I've tested this series and found that the "INT" test will sometimes > > > fail. > > > > > > "not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en > > > PE #3 after migration > > > not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on > > > PE #2 after migration" > > > > > > From logs: > > > "INFO: gicv3: its-migration: Migration complete > > > INT dev_id=2 event_id=20 > > > INFO: gicv3: its-migration: No LPI received whereas (cpuid=3, > > > intid=8195) was expected > > > FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3 > > > after migration > > > INT dev_id=7 event_id=255 > > > INFO: gicv3: its-migration: No LPI received whereas (cpuid=2, > > > intid=8196) was expected > > > FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2 > > > after migration" > > > > > > > +static void check_lpi_stats(const char *msg) > > > > +{ > > > > + bool pass = false; > > > > + > > > > + mdelay(100); > > > > > > After changing this to 'mdelay(1000)', the above error doesn't show up > > > anymore. But it sounds strange that 100ms is not enough to deliver a > > > single LPI. I haven't dig it further but will get back here later. > > > > Did you find some time to investigate this issue. Changing 100 to 1000 > > has a huge impact on the overall test duration and I don't think it is > > sensible. Could you see what is your minimal value that pass the tests? > > I can reproduce this issue with a very *low* probability so I failed > to investigate it :-(. (It might because the LPI was delivered to a > busy vcpu...) > > You can leave it as it is until someone else complain about it again. > Or take the similar approach as check_acked() - wait up to 5s for the > interrupt to be delivered, and bail out as soon as we see it. I think the check_acked approach would be the best approach. Thanks, drew > > > Thanks, > Zenghui > >
diff --git a/arm/gic.c b/arm/gic.c index 649ed81..5f1e595 100644 --- a/arm/gic.c +++ b/arm/gic.c @@ -159,6 +159,85 @@ static void ipi_handler(struct pt_regs *regs __unused) } } +static void setup_irq(irq_handler_fn handler) +{ + gic_enable_defaults(); +#ifdef __arm__ + install_exception_handler(EXCPTN_IRQ, handler); +#else + install_irq_handler(EL1H_IRQ, handler); +#endif + local_irq_enable(); +} + +#if defined(__aarch64__) +struct its_event { + int cpu_id; + int lpi_id; +}; + +struct its_stats { + struct its_event expected; + struct its_event observed; +}; + +static struct its_stats lpi_stats; + +static void lpi_handler(struct pt_regs *regs __unused) +{ + u32 irqstat = gic_read_iar(); + int irqnr = gic_iar_irqnr(irqstat); + + gic_write_eoir(irqstat); + assert(irqnr >= 8192); + smp_rmb(); /* pairs with wmb in lpi_stats_expect */ + lpi_stats.observed.cpu_id = smp_processor_id(); + lpi_stats.observed.lpi_id = irqnr; + smp_wmb(); /* pairs with rmb in check_lpi_stats */ +} + +static void lpi_stats_expect(int exp_cpu_id, int exp_lpi_id) +{ + lpi_stats.expected.cpu_id = exp_cpu_id; + lpi_stats.expected.lpi_id = exp_lpi_id; + lpi_stats.observed.cpu_id = -1; + lpi_stats.observed.lpi_id = -1; + smp_wmb(); /* pairs with rmb in handler */ +} + +static void check_lpi_stats(const char *msg) +{ + bool pass = false; + + mdelay(100); + smp_rmb(); /* pairs with wmb in lpi_handler */ + if (lpi_stats.observed.cpu_id != lpi_stats.expected.cpu_id || + lpi_stats.observed.lpi_id != lpi_stats.expected.lpi_id) { + if (lpi_stats.observed.cpu_id == -1 && + lpi_stats.observed.lpi_id == -1) { + report_info("No LPI received whereas (cpuid=%d, intid=%d) " + "was expected", lpi_stats.expected.cpu_id, + lpi_stats.expected.lpi_id); + } else { + report_info("Unexpected LPI (cpuid=%d, intid=%d)", + lpi_stats.observed.cpu_id, + lpi_stats.observed.lpi_id); + } + } else { + pass = true; + } + report(pass, "%s", msg); +} + +static void secondary_lpi_test(void) +{ + setup_irq(lpi_handler); + cpumask_set_cpu(smp_processor_id(), &ready); + while (1) + wfi(); +} +#endif + static void gicv2_ipi_send_self(void) { writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR); @@ -216,17 +295,6 @@ static void ipi_test_smp(void) report_prefix_pop(); } -static void setup_irq(irq_handler_fn handler) -{ - gic_enable_defaults(); -#ifdef __arm__ - install_exception_handler(EXCPTN_IRQ, handler); -#else - install_irq_handler(EL1H_IRQ, handler); -#endif - local_irq_enable(); -} - static void ipi_send(void) { setup_irq(ipi_handler); @@ -521,6 +589,7 @@ static void gic_test_mmio(void) #if defined(__arm__) static void test_its_introspection(void) {} +static void test_its_trigger(void) {} #else /* __aarch64__ */ @@ -559,6 +628,126 @@ static void test_its_introspection(void) report_info("collection table entry_size = 0x%x", coll_baser->esz); } +static int its_prerequisites(int nb_cpus) +{ + int cpu; + + if (!gicv3_its_base()) { + report_skip("No ITS, skip ..."); + return -1; + } + + if (nr_cpus < nb_cpus) { + report_skip("Test requires at least %d vcpus", nb_cpus); + return -1; + } + + stats_reset(); + + setup_irq(lpi_handler); + + for_each_present_cpu(cpu) { + if (cpu == 0) + continue; + smp_boot_secondary(cpu, secondary_lpi_test); + } + wait_on_ready(); + + its_enable_defaults(); + + return 0; +} + +static void test_its_trigger(void) +{ + struct its_collection *col3, *col2; + struct its_device *dev2, *dev7; + + if (its_prerequisites(4)) + return; + + dev2 = its_create_device(2 /* dev id */, 8 /* nb_ites */); + dev7 = its_create_device(7 /* dev id */, 8 /* nb_ites */); + + col3 = its_create_collection(3 /* col id */, 3/* target PE */); + col2 = its_create_collection(2 /* col id */, 2/* target PE */); + + gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT); + gicv3_lpi_set_config(8196, LPI_PROP_DEFAULT); + + report_prefix_push("int"); + /* + * dev=2, eventid=20 -> lpi= 8195, col=3 + * dev=7, eventid=255 -> lpi= 8196, col=2 + * Trigger dev2, eventid=20 and dev7, eventid=255 + * Check both LPIs hit + */ + + its_send_mapd(dev2, true); + its_send_mapd(dev7, true); + + its_send_mapc(col3, true); + its_send_mapc(col2, true); + + its_send_invall(col2); + its_send_invall(col3); + + its_send_mapti(dev2, 8195 /* lpi id */, 20 /* event id */, col3); + its_send_mapti(dev7, 8196 /* lpi id */, 255 /* event id */, col2); + + lpi_stats_expect(3, 8195); + its_send_int(dev2, 20); + check_lpi_stats("dev=2, eventid=20 -> lpi= 8195, col=3"); + + lpi_stats_expect(2, 8196); + its_send_int(dev7, 255); + check_lpi_stats("dev=7, eventid=255 -> lpi= 8196, col=2"); + + report_prefix_pop(); + + report_prefix_push("inv/invall"); + + /* + * disable 8195, check dev2/eventid=20 does not trigger the + * corresponding LPI + */ + gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT & ~LPI_PROP_ENABLED); + its_send_inv(dev2, 20); + + lpi_stats_expect(-1, -1); + its_send_int(dev2, 20); + check_lpi_stats("dev2/eventid=20 does not trigger any LPI"); + + /* + * re-enable the LPI but willingly do not call invall + * so the change in config is not taken into account. + * The LPI should not hit + */ + gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT); + lpi_stats_expect(-1, -1); + its_send_int(dev2, 20); + check_lpi_stats("dev2/eventid=20 still does not trigger any LPI"); + + /* Now call the invall and check the LPI hits */ + its_send_invall(col3); + lpi_stats_expect(3, 8195); + its_send_int(dev2, 20); + check_lpi_stats("dev2/eventid=20 now triggers an LPI"); + + report_prefix_pop(); + + report_prefix_push("mapd valid=false"); + /* + * Unmap device 2 and check the eventid 20 formerly + * attached to it does not hit anymore + */ + + its_send_mapd(dev2, false); + lpi_stats_expect(-1, -1); + its_send_int(dev2, 20); + check_lpi_stats("no LPI after device unmap"); + report_prefix_pop(); +} #endif int main(int argc, char **argv) @@ -592,6 +781,10 @@ int main(int argc, char **argv) report_prefix_push(argv[1]); gic_test_mmio(); report_prefix_pop(); + } else if (!strcmp(argv[1], "its-trigger")) { + report_prefix_push(argv[1]); + test_its_trigger(); + report_prefix_pop(); } else if (strcmp(argv[1], "its-introspection") == 0) { report_prefix_push(argv[1]); test_its_introspection(); diff --git a/arm/unittests.cfg b/arm/unittests.cfg index 23d378e..b9a7a2c 100644 --- a/arm/unittests.cfg +++ b/arm/unittests.cfg @@ -129,6 +129,13 @@ extra_params = -machine gic-version=3 -append 'its-introspection' groups = its arch = arm64 +[its-trigger] +file = gic.flat +smp = $MAX_SMP +extra_params = -machine gic-version=3 -append 'its-trigger' +groups = its +arch = arm64 + # Test PSCI emulation [psci] file = psci.flat
Triggers LPIs through the INT command. the test checks the LPI hits the right CPU and triggers the right LPI intid, ie. the translation is correct. Updates to the config table also are tested, along with inv and invall commands. Signed-off-by: Eric Auger <eric.auger@redhat.com> --- v5 -> v6: - removed collection-unmap test - moved the collection invalidation after the MAPCs v4 -> v5: - move the test stub from the header to arm/gic.c v3 -> v4: - assert in lpi_handler if the interrupt is not an LPI - remove check_lpi_stats from its_prerequisites() v2 -> v3: - add comments - keep the report_skip in case there aren't 4 vcpus to be able to run other tests in the its category. - fix the prefix pop - move its_event and its_stats to arm/gic.c --- arm/gic.c | 215 +++++++++++++++++++++++++++++++++++++++++++--- arm/unittests.cfg | 7 ++ 2 files changed, 211 insertions(+), 11 deletions(-)