Message ID | 20230217-coverity-fixes-v1-3-043fac896a40@intel.com |
---|---|
State | Accepted |
Commit | 82a0cda56e10c1167e7e19a651d2dce7debc3b7e |
Headers | show |
Series | cxl/monitor: coverity and misc other fixes | expand |
On 2/17/23 5:40 PM, Vishal Verma wrote: > This test failed intermittently because the ndctl-list operation right > after a 'modprobe cxl_test' could race the actual nmem devices getting > loaded. > > Since CXL device probes are asynchronous, and cxl_acpi might've kicked > off a cxl_bus_rescan(), a cxl_flush() (via cxl_wait_probe()) can ensure > everything is loaded. > > Add a plain cxl-list right after the modprobe to allow for a flush/wait > cycle. > > Cc: Dave Jiang <dave.jiang@intel.com> > Suggested-by: Dan Williams <dan.j.williams@intel.com> > Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> > --- > test/security.sh | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/test/security.sh b/test/security.sh > index 04f630e..fb04aa6 100755 > --- a/test/security.sh > +++ b/test/security.sh > @@ -225,6 +225,7 @@ if [ "$uid" -ne 0 ]; then > fi > > modprobe "$KMOD_TEST" > +cxl list > setup > check_prereq "keyctl" > rc=1 >
On Fri, Feb 17, 2023 at 05:40:24PM -0700, Vishal Verma wrote: > This test failed intermittently because the ndctl-list operation right > after a 'modprobe cxl_test' could race the actual nmem devices getting > loaded. > > Since CXL device probes are asynchronous, and cxl_acpi might've kicked > off a cxl_bus_rescan(), a cxl_flush() (via cxl_wait_probe()) can ensure > everything is loaded. > > Add a plain cxl-list right after the modprobe to allow for a flush/wait > cycle. Is this the preferred method to 'settle', instead of udevadm settle? > > Cc: Dave Jiang <dave.jiang@intel.com> > Suggested-by: Dan Williams <dan.j.williams@intel.com> > Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> > --- > test/security.sh | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/test/security.sh b/test/security.sh > index 04f630e..fb04aa6 100755 > --- a/test/security.sh > +++ b/test/security.sh > @@ -225,6 +225,7 @@ if [ "$uid" -ne 0 ]; then > fi > > modprobe "$KMOD_TEST" > +cxl list > setup > check_prereq "keyctl" > rc=1 > > -- > 2.39.1 > >
Alison Schofield wrote: > On Fri, Feb 17, 2023 at 05:40:24PM -0700, Vishal Verma wrote: > > This test failed intermittently because the ndctl-list operation right > > after a 'modprobe cxl_test' could race the actual nmem devices getting > > loaded. > > > > Since CXL device probes are asynchronous, and cxl_acpi might've kicked > > off a cxl_bus_rescan(), a cxl_flush() (via cxl_wait_probe()) can ensure > > everything is loaded. > > > > Add a plain cxl-list right after the modprobe to allow for a flush/wait > > cycle. > > Is this the preferred method to 'settle', instead of udevadm settle? In general, 'udevadm settle' is only the first phase of flushing subsystem setup work in that it can only flush the udev event queue. I.e. a device arriving kicks off a KOBJ_ADD event, and once that event is processed without kicking off more KOBJ_* events the queue is settled. For CXL this means that the cxl_acpi and cxl_pci modules are loaded, but the asynchronous work they kick off to probe devices and rescan the bus is invisible to 'udevadm settle'. Internally 'cxl list' is performing a 'udevadm settle' and then flushing the follow on async work.
On Tue, 2023-02-21 at 09:22 -0800, Alison Schofield wrote: > On Fri, Feb 17, 2023 at 05:40:24PM -0700, Vishal Verma wrote: > > This test failed intermittently because the ndctl-list operation right > > after a 'modprobe cxl_test' could race the actual nmem devices getting > > loaded. > > > > Since CXL device probes are asynchronous, and cxl_acpi might've kicked > > off a cxl_bus_rescan(), a cxl_flush() (via cxl_wait_probe()) can ensure > > everything is loaded. > > > > Add a plain cxl-list right after the modprobe to allow for a flush/wait > > cycle. > > Is this the preferred method to 'settle', instead of udevadm settle? Generally, no. Usually cxl tests would use cxl-cli commands, which now have the necessary waits via cxl_wait_probe(), so even a 'udevadm settle' shouldn't be needed. In this case, the first thing we run is ndctl list, which waits for nvdimm things to 'settle', but we were racing with cxl_test coming up, which it (ndctl) knows nothing about. > > > > > Cc: Dave Jiang <dave.jiang@intel.com> > > Suggested-by: Dan Williams <dan.j.williams@intel.com> > > Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> > > --- > > test/security.sh | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/test/security.sh b/test/security.sh > > index 04f630e..fb04aa6 100755 > > --- a/test/security.sh > > +++ b/test/security.sh > > @@ -225,6 +225,7 @@ if [ "$uid" -ne 0 ]; then > > fi > > > > modprobe "$KMOD_TEST" > > +cxl list > > setup > > check_prereq "keyctl" > > rc=1 > > > > -- > > 2.39.1 > > > >
On Tue, Feb 21, 2023 at 10:13:16AM -0800, Vishal Verma wrote: > On Tue, 2023-02-21 at 09:22 -0800, Alison Schofield wrote: > > On Fri, Feb 17, 2023 at 05:40:24PM -0700, Vishal Verma wrote: > > > This test failed intermittently because the ndctl-list operation right > > > after a 'modprobe cxl_test' could race the actual nmem devices getting > > > loaded. > > > > > > Since CXL device probes are asynchronous, and cxl_acpi might've kicked > > > off a cxl_bus_rescan(), a cxl_flush() (via cxl_wait_probe()) can ensure > > > everything is loaded. > > > > > > Add a plain cxl-list right after the modprobe to allow for a flush/wait > > > cycle. > > > > Is this the preferred method to 'settle', instead of udevadm settle? > > Generally, no. Usually cxl tests would use cxl-cli commands, which now > have the necessary waits via cxl_wait_probe(), so even a 'udevadm > settle' shouldn't be needed. > > In this case, the first thing we run is ndctl list, which waits for > nvdimm things to 'settle', but we were racing with cxl_test coming up, > which it (ndctl) knows nothing about. OK - I'll stop doing the udevadm settle, since what I'm doing is pure 'cxl' and modprobe is always followed by cxl-cli commands. > > > > > > > > > Cc: Dave Jiang <dave.jiang@intel.com> > > > Suggested-by: Dan Williams <dan.j.williams@intel.com> > > > Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> > > > --- > > > test/security.sh | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/test/security.sh b/test/security.sh > > > index 04f630e..fb04aa6 100755 > > > --- a/test/security.sh > > > +++ b/test/security.sh > > > @@ -225,6 +225,7 @@ if [ "$uid" -ne 0 ]; then > > > fi > > > > > > modprobe "$KMOD_TEST" > > > +cxl list > > > setup > > > check_prereq "keyctl" > > > rc=1 > > > > > > -- > > > 2.39.1 > > > > > > >
diff --git a/test/security.sh b/test/security.sh index 04f630e..fb04aa6 100755 --- a/test/security.sh +++ b/test/security.sh @@ -225,6 +225,7 @@ if [ "$uid" -ne 0 ]; then fi modprobe "$KMOD_TEST" +cxl list setup check_prereq "keyctl" rc=1
This test failed intermittently because the ndctl-list operation right after a 'modprobe cxl_test' could race the actual nmem devices getting loaded. Since CXL device probes are asynchronous, and cxl_acpi might've kicked off a cxl_bus_rescan(), a cxl_flush() (via cxl_wait_probe()) can ensure everything is loaded. Add a plain cxl-list right after the modprobe to allow for a flush/wait cycle. Cc: Dave Jiang <dave.jiang@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> --- test/security.sh | 1 + 1 file changed, 1 insertion(+)