Message ID | 20230530124056.18332-2-pmorel@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fixing infinite loop on SCLP READ SCP INFO error | expand |
Quoting Pierre Morel (2023-05-30 14:40:55) > A kvm-unit-test would hang if an abort happens before SCLP Read SCP > Information has completed if sclp_get_cpu_num() does not report at > least one CPU. > Since we obviously have one, report it. Sorry for complaining again, in a discussion with Janosch we found that the description and commit below can be easily misunderstood. I suggest the following wording in the commit description: s390x: sclp: treat system as single processor when read_info is NULL When a test abort()s before SCLP read info is completed, the assertion on read_info in sclp_read_info() will fail. Since abort() eventually calls smp_teardown() which in turn calls sclp_get_cpu_num(), this will cause an infinite abort() chain, causing the test to hang. Fix this by considering the system single processor when read_info is missing. [...] > diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c > index 12919ca..34a31da 100644 > --- a/lib/s390x/sclp.c > +++ b/lib/s390x/sclp.c > @@ -121,6 +121,12 @@ int sclp_get_cpu_num(void) > { > if (read_info) > return read_info->entries_cpu; > + /* > + * If we fail here and read_info has not being set, > + * it means we failed early and we try to abort the test. > + * We need to return at least one CPU, and obviously we have > + * at least one, for the smp_teardown to correctly work. > + */ Please make this: Don't abort here if read_info is NULL since abort() calls smp_teardown() which eventually calls this function and thus causes an infinite abort() chain, causing the test to hang. Since we obviously have at least one CPU, just return one. With these changes: Reviewed-by: Nico Boehr <nrb@linux.ibm.com> Sorry for the back and forth.
On 6/1/23 13:52, Nico Boehr wrote: > Quoting Pierre Morel (2023-05-30 14:40:55) >> A kvm-unit-test would hang if an abort happens before SCLP Read SCP >> Information has completed if sclp_get_cpu_num() does not report at >> least one CPU. >> Since we obviously have one, report it. > Sorry for complaining again, in a discussion with Janosch we found that the > description and commit below can be easily misunderstood. I suggest the > following wording in the commit description: > > s390x: sclp: treat system as single processor when read_info is NULL > > When a test abort()s before SCLP read info is completed, the assertion on > read_info in sclp_read_info() will fail. Since abort() eventually calls > smp_teardown() which in turn calls sclp_get_cpu_num(), this will cause an > infinite abort() chain, causing the test to hang. > > Fix this by considering the system single processor when read_info is missing. > > [...] better, I take it thx >> diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c >> index 12919ca..34a31da 100644 >> --- a/lib/s390x/sclp.c >> +++ b/lib/s390x/sclp.c >> @@ -121,6 +121,12 @@ int sclp_get_cpu_num(void) >> { >> if (read_info) >> return read_info->entries_cpu; >> + /* >> + * If we fail here and read_info has not being set, >> + * it means we failed early and we try to abort the test. >> + * We need to return at least one CPU, and obviously we have >> + * at least one, for the smp_teardown to correctly work. >> + */ > Please make this: > > Don't abort here if read_info is NULL since abort() calls smp_teardown() which > eventually calls this function and thus causes an infinite abort() chain, > causing the test to hang. Since we obviously have at least one CPU, just return > one. > > With these changes: > > Reviewed-by: Nico Boehr <nrb@linux.ibm.com> > > Sorry for the back and forth.
diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c index 12919ca..34a31da 100644 --- a/lib/s390x/sclp.c +++ b/lib/s390x/sclp.c @@ -121,6 +121,12 @@ int sclp_get_cpu_num(void) { if (read_info) return read_info->entries_cpu; + /* + * If we fail here and read_info has not being set, + * it means we failed early and we try to abort the test. + * We need to return at least one CPU, and obviously we have + * at least one, for the smp_teardown to correctly work. + */ return 1; }
A kvm-unit-test would hang if an abort happens before SCLP Read SCP Information has completed if sclp_get_cpu_num() does not report at least one CPU. Since we obviously have one, report it. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> --- lib/s390x/sclp.c | 6 ++++++ 1 file changed, 6 insertions(+)